This option is a deterministic way to control the flow of articles and to
split a feed. The hashfeed parameter must be in the form value/mod
or start-end/mod. The Message-ID of each article is hashed using MD5,
which results in a 128-bit hash. The lowest 32 bits are then taken
by default as the hashfeed value (which is an integer). If the hashfeed
value modulo mod, plus one, equals value or lies between start
and end, pullnews will feed the article. All these numbers must be integers.
You can use an extended syntax of the form value/mod:offset or start-end/mod:offset (using an underscore _ instead of a colon : is also recognized). As MD5 generates a 128-bit return value, it is possible to specify from which byte-offset the 32-bit integer used by hashfeed starts. The default value for offset is :0 and thirteen overlapping values from :0 to :12 can be used. Only up to four totally independent values exist: :0, :4, :8 and :12.
The offset therefore makes it possible to generate a second level of deterministic distribution. For instance, if pullnews feeds 1/2, the feed can be split further with 1-3/9:4. Up to four levels of deterministic distribution can be used.
The algorithm is compatible with the one used by Diablo 5.1 and up.
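The selection rule described above can be sketched in Python. This is a minimal illustration, not the code pullnews runs: the function names are hypothetical, and the byte order is an assumption (the digest is read as one big-endian 128-bit number, so the lowest 32 bits are its last four bytes and a larger offset moves toward the front of the digest). Treating start and end as inclusive bounds is likewise an assumption.

```python
import hashlib
import re

# Hypothetical helpers for illustration only; not taken from pullnews.
_SPEC = re.compile(r"(\d+)(?:-(\d+))?/(\d+)(?:[:_](\d+))?")


def parse_hashfeed(spec):
    """Split 'value/mod[:offset]' or 'start-end/mod[:offset]' into integers."""
    m = _SPEC.fullmatch(spec)
    if m is None:
        raise ValueError(f"bad hashfeed spec: {spec!r}")
    start = int(m.group(1))
    end = int(m.group(2)) if m.group(2) is not None else start
    mod = int(m.group(3))
    offset = int(m.group(4)) if m.group(4) is not None else 0
    if mod == 0 or not 0 <= offset <= 12:
        raise ValueError(f"bad modulus or offset in {spec!r}")
    return start, end, mod, offset


def hashfeed_value(message_id, mod, offset=0):
    """Return the hashfeed value of a Message-ID, in the range 1..mod."""
    digest = hashlib.md5(message_id.encode("utf-8")).digest()  # 16 bytes
    # Assumption: offset 0 selects the last four bytes of the digest
    # (the lowest 32 bits of a big-endian 128-bit number).
    first = 12 - offset
    word = int.from_bytes(digest[first:first + 4], "big")
    return word % mod + 1


def hashfeed_match(message_id, spec):
    """True if the article would be fed under the given hashfeed spec."""
    start, end, mod, offset = parse_hashfeed(spec)
    return start <= hashfeed_value(message_id, mod, offset) <= end
```

Under these assumptions, an article accepted by `hashfeed_match(msgid, "1/2")` could be tested again with `hashfeed_match(msgid, "1-3/9:4")` to obtain a second, independent level of splitting.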
|-b fraction||Backtrack on server numbering reset. Specify the proportion (0.0 to 1.0) of a group's articles to pull when the server's article number is less than our high for that group. When fraction is 1.0, pull all the articles on a renumbered server. The default is to do nothing.|
|-B||Feed is header-only, that is to say pullnews only feeds the headers of the articles, plus one blank line. It adds the Bytes: header field if the article does not already have one, and keeps the body only if the article is a control article.|
Normally, the config file is stored in pullnews.marks in pathdb
when pullnews is run as the news user, or otherwise in the running
user's home directory. If -c is given, config will be used as
the config file instead. This is useful if you're running pullnews
as a system user on an automated basis out of cron or as an individual
user, rather than the news user.
See CONFIG FILE below for the format of this file.
|-C width||Use width characters per line for the progress table. The default value is 50.|
|-d level||Set the debugging level to the integer level; more debugging output will be logged as this increases. The default value is 0.|
|-f fraction||This changes the proportion of articles to get from each group to fraction and should be in the range 0.0 to 1.0 (1.0 being the default).|
|-F fakehop||Prepend fakehop as a host to the Path: header of articles fed.|
|-g groups||Specify a collection of groups to get. groups is a list of newsgroups separated by commas (only commas, no spaces). Each group must be defined in the config file, and only the remote hosts that carry those groups will be contacted. Note that this is a simple list of groups, not a wildmat expression, and wildcards are not supported.|
|-G newsgroups||Add the comma-separated list of groups newsgroups to each server in the configuration file (see also -g and -w).|
|-h||Print a usage message and exit.|
|-H headers||Remove these named headers (colon-separated list) from fed articles.|
|-k checkpt||Checkpoint (save) the config file every checkpt articles (default is 0, that is to say at the end of the session).|
|-l logfile||Log progress/stats to logfile (default is stdout).|
Feed an article based on header matching. The argument is a number of
whitespace-separated tuples (each tuple being a colon-separated header and
regular expression). For instance:
    -m "Hdr1:regexp1 !Hdr2:regexp2 #Hdr3:regexp3 !#Hdr4:regexp4"
specifies that the article will be passed only if the Hdr1: header matches regexp1 and the Hdr2: header does not match regexp2. Besides, if the Hdr3: header matches regexp3, that header is removed; and if the Hdr4: header does not match regexp4, that header is removed.
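The four cases just described can be sketched as follows. This is only an illustration under stated assumptions, not the pullnews implementation: the function name is hypothetical, a leading `!` is assumed to negate the match, a leading `#` is assumed to mean "remove this header" rather than "filter the article", header names are looked up with exact case, and a missing header is treated as not matching.

```python
import re


def apply_header_patterns(headers, header_pats):
    """Return (feed, kept_headers) for a dict of headers and a pattern string.

    Hypothetical helper: '!' (negate) and '#' (remove header) prefixes
    are assumptions made for this sketch.
    """
    kept = dict(headers)
    feed = True
    for item in header_pats.split():
        negate = item.startswith("!")
        if negate:
            item = item[1:]
        remove = item.startswith("#")
        if remove:
            item = item[1:]
        name, _, pattern = item.partition(":")
        value = headers.get(name)
        matched = value is not None and re.search(pattern, value) is not None
        hit = matched != negate  # a negated tuple hits when there is no match
        if remove:
            if hit:
                kept.pop(name, None)  # drop the header, keep the article
        elif not hit:
            feed = False  # a filtering tuple failed, so do not feed
    return feed, kept
```

Because tuples are split on whitespace, a regular expression in this sketch cannot itself contain a space.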
|-M num||Specify the maximum number of articles (per group) to process. The default is to process all new articles. See also -f.|
|-n||Do nothing but read articles -- does not feed articles downstream, writes no rnews file, does not update the config file.|
|-N timeout||Specify the timeout length, as timeout seconds, when establishing an NNTP connection.|
|-O||Use an optimized mode: pullnews checks whether the article already exists on the downstream server, before downloading it. It may help for huge articles or a slow link to upstream hosts.|
|-p port||Connect to the destination news server on a port other than the default of 119. This option does not change the port used to connect to the source news servers.|
|-P hop_limit||Restrict feeding an article based on the number of hops it has already made. Count the hops in the Path: header (hop_count), feeding the article only when hop_limit is +num and hop_count is more than num; or hop_limit is -num and hop_count is less than num.|
|-q||Print out less status information while running.|
|-Q level||Set the quietness level (-Q 2 is equivalent to -q). The higher this value, the less gets logged. The default is 0.|
|-r file||Rather than feeding the downloaded articles to a destination server, instead create a batch file that can later be fed to a server using rnews. See rnews(1) for more information about the batch file format.|
|-R||Be a reader (use MODE READER and POST commands) to the downstream server. The default is to use the IHAVE command.|
|-s to-server[:port]||Normally, pullnews will feed the articles it retrieves to the news server running on localhost. To connect to a different host, specify a server with the -s flag. You can also specify the port with this same flag or use -p.|
|-S max-run||Specify the maximum time max-run in seconds for pullnews to run.|
|-t retries||The maximum number (retries) of attempts to connect to a server (see also -T). The default is 0.|
|-T connect-pause||Pause connect-pause seconds between connection retries (see also -t). The default is 1.|
|-w num||Set each group's high water mark (last received article number) to num. If num is negative, calculate Current+num instead (i.e. get the last num articles). Therefore, a num of 0 will re-get all articles on the server; whereas a num of -0 will get no old articles, setting the water mark to Current (the most recent article on the server).|
|-x||If the -x flag is used, an Xref: header is added to any article that lacks one. It can be useful for instance if articles are fed to a news server which has xrefslave set in inn.conf.|
|-z article-pause||Sleep article-pause seconds between articles. The default is 0.|
|-Z group-pause||Sleep group-pause seconds between groups. The default is 0.|
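Two of the options above, -P and -w, give a special meaning to the sign of their argument. The sketch below illustrates both rules in Python. The function names are hypothetical, and the way hops are counted (one per non-empty `!`-separated element of the Path: header, including the final element) is an assumption; pullnews may count differently. The -w argument is kept as a string so that `-0` remains distinguishable from `0`.

```python
def hop_limit_allows(path, hop_limit):
    """Apply the -P rule: '+num' needs more than num hops, '-num' fewer."""
    # Assumption: every non-empty "!"-separated element counts as one hop.
    hop_count = sum(1 for hop in path.split("!") if hop.strip())
    sign = hop_limit[:1]
    if sign not in ("+", "-"):
        raise ValueError("hop_limit must look like +num or -num")
    num = int(hop_limit[1:])
    if sign == "+":
        return hop_count > num
    return hop_count < num


def new_high_water_mark(num, current):
    """Apply the -w rule to the raw argument string and the server's Current."""
    text = num.strip()
    if text.startswith("-"):
        # Negative (including "-0"): relative to the most recent article.
        return current + int(text)
    # Non-negative: an absolute article number.
    return int(text)
```

For example, with Current at 783, `new_high_water_mark("-25", 783)` gives 758, so only the last 25 articles would be fetched.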
The config file for pullnews is divided into blocks, one block for each remote server to connect to. A block begins with the host line (which must have no leading whitespace) and contains just the hostname of the remote server, optionally followed by authentication details (username and password for that server). Note that authentication details can also be provided for the downstream server (a host line could be added for it in the configuration file, with no newsgroup to fetch).
Following the host line should be one or more newsgroup lines which start with whitespace followed by the name of a newsgroup to retrieve. Only one newsgroup should be listed on each line.
pullnews will update the config file to include the time the group was last checked and the highest numbered article successfully retrieved and transferred to the destination server. It uses this data to avoid doing duplicate work the next time it runs.
The full syntax is:
<host> [<username> <password>]
        <group> [<time> <high>]
        <group> [<time> <high>]
where the <host> line must not have leading whitespace and the <group> lines must.
A typical configuration file would be:
# Format group date high
data.pa.vix.com
        rec.bicycles.racing 908086612 783
        rec.humor.funny 908086613 18
        comp.programming.threads
nnrp.vix.com pull sekret
        comp.std.lisp
Note that an earlier run of pullnews has filled in details about the last articles downloaded from the two rec.* groups. The two comp.* groups were just added by the user and have not yet been checked.
The nnrp.vix.com server requires authentication, and pullnews will use the username pull and the password sekret.
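To make the block structure concrete, here is a minimal Python sketch of a reader for this format. It is not the parser used by pullnews: the function name and the returned dictionary layout are hypothetical, and treating lines that begin with `#` as comments is an assumption based on the sample above.

```python
def parse_pullnews_config(text):
    """Parse host blocks and their indented group lines into a list of dicts."""
    servers = []
    for raw in text.splitlines():
        stripped = raw.strip()
        # Assumption: blank lines and "#" lines carry no data.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if raw[0].isspace():
            # Group line: <group> [<time> <high>]
            if not servers:
                raise ValueError("group line before any host line")
            group = {"name": fields[0], "time": None, "high": None}
            if len(fields) >= 3:
                group["time"], group["high"] = int(fields[1]), int(fields[2])
            servers[-1]["groups"].append(group)
        else:
            # Host line: <host> [<username> <password>]
            server = {"host": fields[0], "username": None,
                      "password": None, "groups": []}
            if len(fields) >= 3:
                server["username"], server["password"] = fields[1], fields[2]
            servers.append(server)
    return servers


SAMPLE = """\
# Format group date high
data.pa.vix.com
        rec.bicycles.racing 908086612 783
        rec.humor.funny 908086613 18
        comp.programming.threads
nnrp.vix.com pull sekret
        comp.std.lisp
"""

servers = parse_pullnews_config(SAMPLE)
```

Groups that have never been checked, such as comp.programming.threads, come back with `time` and `high` set to `None` in this sketch.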
|pathbin/pullnews||The Perl script itself, used to pull news from upstream servers and feed it to another news server.|
|pathdb/pullnews.marks or ~/pullnews.marks||The default config file. It is stored in pullnews.marks in pathdb when pullnews is run as the news user, or otherwise in the running user's home directory.|
pullnews was written by James Brister for INN. The documentation was rewritten in POD by Russ Allbery <email@example.com>.
Geraint A. Edwards greatly improved pullnews, adding 16 new recognized flags, fixing some bugs, and integrating the backupfeed contrib script by Kai Henningsen, which added 6 more flags.
$Id: pullnews.pod 9767 2014-12-07 21:13:43Z iulius $
|INN 2.6.0||PULLNEWS (1)||2015-09-12|