Manual Reference Pages  -  XMLTV::AUGMENT (3)

NAME

XMLTV::Augment - Augment XMLTV listings files with automatic and user-defined rules.

DESCRIPTION

Augment an XMLTV xml file by applying corrections (fixups) to programmes matching defined criteria (rules).

Two types of rules are actioned: (i) automatic, (ii) user-defined.

Automatic rules use pre-programmed input and output to modify the input programmes, e.g. removing a title where it is repeated in a sub-title (e.g. Horizon / Horizon: Star Wars), or trying to identify and extract series/episode numbers from the programme's title, sub-title or description.

User-defined rules use the content of a rules file which allows programmes matching certain user-defined criteria to be corrected/enhanced with the user data supplied (e.g. adding/changing categories for all episodes of Horizon, or fixing misspellings in titles, etc.)

By setting appropriate options in the config file, the rules file can be automatically downloaded using XMLTV::Supplement.

EXPORTED FUNCTIONS

setEncoding - Set the assumed encoding of the rules file.
inputChannel - Store each channel found in the input programmes file for later processing by stats.
augmentProgramme - Augment a programme using (i) pre-determined rules and (ii) user-defined rules. Which rules are processed is determined by the options set in the config file.
printInfo - Print the lists of actions taken and suggestions for further fixups.
end - Do any final processing before exit (e.g. close the log file).

INSTANTIATION



    new XMLTV::Augment( { ...parameters...} );



Possible parameters:
rule => filename of file containing fixup rules (if omitted then no user-defined rules will be actioned) (overrides auto-fetch Supplement if that is defined; see sample options file)
config => filename of config file to read (if omitted then no config file will be used)
encoding => assumed encoding of the rules file (default = UTF-8)
stats => whether to print the audit stats in the log (values = 0,1) (default = 1)
log => filename of output log (default = augment.log)
debug => debug level (values 0-10) (default = no debug)
Note: a debug level > 3 is unlikely to be of much use unless you are developing code.

TYPICAL USAGE



 1) Create the XMLTV::Augment object
 2) Pass each channel to inputChannel()
 3) Pass each programme to augmentProgramme()
 4) Tidy up using printInfo() & end()

        #instantiate the object
        my $augment = new XMLTV::Augment(
                  "rule"       => "myrules.txt",
                  "config"     => "myconfig.txt",
                  "log"        => "augment.log",
                  );
        die "failed to create XMLTV::Augment object" if !$augment;
       
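        # @channels: the channel hashes you have read from the input file
        foreach my $ch (@channels) {
          # store the channel details
          $augment->inputChannel( $ch );
        }
       
        # @programmes: the programme hashes you have read from the input file
        foreach my $prog (@programmes) {
          # augmentProgramme will now do any requested processing of the input xml
          $prog = $augment->augmentProgramme( $prog );
        }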
       
        # log the stats
        $augment->printInfo();

        # close the log file
        $augment->end();



Note: you are responsible for reading/writing to the XMLTV .xml file; the package will not do that for you.

RULES

remove_duplicated_new_title_in_ep - Rule #A1

Remove New $title : from <sub-title>



  If sub-title starts with "New" + <title> + separator, then it will be removed from the sub-title
  "separator" can be any of .,:;-

  in : "Antiques Roadshow / New Antiques Roadshow: Doncaster"
  out: "Antiques Roadshow / Doncaster"
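
The check described above can be sketched with a single substitution. This is only an illustration on plain strings, not the module's own code: the helper name strip_new_title is hypothetical, and the case-insensitive match is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XMLTV::Augment): strip a leading
# "New <title><separator>" from a sub-title. Plain strings are used here;
# the module itself works on XMLTV programme hashes.
sub strip_new_title {
    my ($title, $subtitle) = @_;
    # \Q...\E quotes any regex metacharacters in the title;
    # /i (case-insensitive) is an assumption
    $subtitle =~ s/^New\s+\Q$title\E\s*[.,:;-]\s*//i;
    return $subtitle;
}

print strip_new_title("Antiques Roadshow", "New Antiques Roadshow: Doncaster"), "\n";
# prints: Doncaster
```

A sub-title that does not start with "New" + title + separator is returned unchanged.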



remove_duplicated_title_and_ep_in_ep - Rule #A2

Remove duplicated programme title *and* episode from <sub-title>



  If sub-title starts with <title> + separator + <episode> + separator + <episode>, then the title and the duplicated episode will be removed from the sub-title
  "separator" can be any of .,:;-

  in : "Antiques Roadshow / Antiques Roadshow: Doncaster: Doncaster"
  out: "Antiques Roadshow / Doncaster"



remove_duplicated_title_in_ep - Rule #A3

Remove duplicated programme title from <sub-title>



  i) If sub-title starts with <title> + separator, then it will be removed from the sub-title
  ii) If sub-title ends with separator + <title>, then it will be removed from the sub-title
  iii) If sub-title starts with <title>(...), then the sub-title will be set to the text in brackets
  iv) If sub-title equals <title>, then the sub-title will be removed
  "separator" can be any of .,:;-

  in : "Antiques Roadshow / Antiques Roadshow: Doncaster"
  out: "Antiques Roadshow / Doncaster"

  in : "Antiques Roadshow / Antiques Roadshow (Doncaster)"
  out: "Antiques Roadshow / Doncaster"
 
  in : "Antiques Roadshow / Antiques Roadshow"
  out: "Antiques Roadshow / "
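
The four cases above might be expressed as follows. This is a sketch under assumptions: it works on plain strings, the helper name dedupe_title_in_subtitle is hypothetical, and the order in which the cases are tried is a guess rather than the module's documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper covering cases (i)-(iv); returns the cleaned sub-title.
sub dedupe_title_in_subtitle {
    my ($title, $subtitle) = @_;
    my $t   = quotemeta $title;     # escape regex metacharacters in the title
    my $sep = qr/[.,:;-]/;

    return '' if $subtitle eq $title;                  # (iv) sub-title equals title
    return $1 if $subtitle =~ /^$t\s*\((.+)\)\s*\z/;   # (iii) title followed by (text)
    $subtitle =~ s/^$t\s*$sep\s*//;                    # (i) leading title + separator
    $subtitle =~ s/\s*$sep\s*$t\s*\z//;                # (ii) trailing separator + title
    return $subtitle;
}

print dedupe_title_in_subtitle("Antiques Roadshow", "Antiques Roadshow (Doncaster)"), "\n";
# prints: Doncaster
```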



update_premiere_repeat_flags_from_desc - Rule #A4

Set the <premiere> element and remove any <previously-shown> element if <desc> starts with "Premiere." or "New series". Remove the "Premiere." text.

Set the <previously-shown> element and remove any <premiere> element if <desc> starts with "Another chance", "Rerun" or "Repeat".
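
As an illustration only, the flag handling could look like the sketch below. It assumes a simplified flat record with hypothetical keys (desc, premiere, previously_shown) rather than the real XMLTV programme hash, and it assumes the match is case-insensitive.

```perl
use strict;
use warnings;

# Hypothetical flat record: { desc => string, premiere => flag, previously_shown => flag }.
# The real XMLTV programme hash is nested differently.
sub update_flags_from_desc {
    my ($prog) = @_;
    my $desc = $prog->{desc} // '';

    if ($desc =~ /^\s*(?:Premiere\.|New series)/i) {
        $prog->{premiere} = 1;
        delete $prog->{previously_shown};
        # only the "Premiere." text is removed; "New series" is left in place
        $prog->{desc} =~ s/^\s*Premiere\.\s*//i;
    }
    elsif ($desc =~ /^\s*(?:Another chance|Rerun|Repeat)/i) {
        $prog->{previously_shown} = 1;
        delete $prog->{premiere};
    }
    return $prog;
}

my $p = update_flags_from_desc({ desc => "Premiere. A new drama.", previously_shown => 1 });
print $p->{desc}, "\n";
# prints: A new drama.
```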

check_potential_numbering_in_text - Rule #A5

Check for potential series, episode and part numbering in the title, episode and description fields.

extract_numbering_from_title - Rule #A5.1

Extract series/episode numbering found in <title>.

extract_numbering_from_episode - Rule #A5.2

Extract series/episode numbering found in <sub-title>.

extract_numbering_from_desc - Rule #A5.3

Extract series/episode numbering found in <desc>.

make_episode_from_part_numbers - Rule #A6

If no <sub-title> then make one from part numbers.



  in : "Panorama / "  desc = "Part 1/2..."
  out: "Panorama / Part 1 of 2"
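
A minimal sketch of this idea, assuming plain strings and only the "Part n/m" form shown in the example; other numbering formats the module may recognise are not covered, and the helper name episode_from_parts is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build a sub-title from "Part n/m" in the description,
# but only when the programme has no sub-title of its own.
sub episode_from_parts {
    my ($subtitle, $desc) = @_;
    return $subtitle if defined $subtitle && length $subtitle;
    return "Part $1 of $2"
        if defined $desc && $desc =~ m{\bPart\s+(\d+)\s*/\s*(\d+)}i;
    return $subtitle // '';
}

print episode_from_parts('', "Part 1/2..."), "\n";
# prints: Part 1 of 2
```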



process_user_rules - Rule #user

Process programme against user-defined fixups

The individual rules each have their own option to run or not; consider this like an on/off switch for all of them. I.e. if this option is off then no user rules will be run (irrespective of any other option flags).

process_non_title_info - Rule #1

Remove specified non-title text from <title>.



  If title starts with text + separator, then it will be removed from the title
  "separator" can be any of :;-

  rule: 1|Python Night
  in : "Python Night: Monty Python - Live at the Hollywood Bowl / "
  out: "Monty Python - Live at the Hollywood Bowl / "
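
The rule lines in these examples appear to consist of a numeric rule type, a "|", and one or more fields separated by "~". The sketch below reads such a line and applies a type 1 rule to a plain title string. The line format is inferred from the examples in this page only, and both helper names (parse_rule, remove_non_title_text) are hypothetical.

```perl
use strict;
use warnings;

# Split a rules line into its numeric type and "~"-separated fields.
# Format inferred from the examples in this page, not from the module source.
sub parse_rule {
    my ($line) = @_;
    chomp $line;
    my ($type, $rest) = split /\|/, $line, 2;
    return unless defined $rest && $type =~ /^\d+$/;
    my @fields = split /~/, $rest, -1;    # -1 keeps empty fields such as "a~~b"
    return { type => $type + 0, fields => \@fields };
}

# Rule type 1: drop "<text><separator>" from the start of a title.
sub remove_non_title_text {
    my ($title, $text) = @_;
    $title =~ s/^\Q$text\E\s*[:;-]\s*//;
    return $title;
}

my $rule = parse_rule("1|Python Night");
print remove_non_title_text(
    "Python Night: Monty Python - Live at the Hollywood Bowl",
    $rule->{fields}[0] ), "\n";
# prints: Monty Python - Live at the Hollywood Bowl
```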



process_demoted_titles - Rule #11

Promote demoted title from <sub-title> to <title>.



  If title matches, and sub-title starts with text then remove matching text from sub-title and move it into the title.
  Any text after separator in the sub-title is preserved. separator can be any of .,:;-

  rule: 11|Blackadder~Blackadder II
  in : "Blackadder / Blackadder II: Potato"
  out: "Blackadder II / Potato"
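
As a sketch of the promotion step, assuming plain strings and an exact title comparison (the helper name promote_demoted_title is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper for a rule "11|<title>~<text>": when the title matches
# and the sub-title starts with <text>, <text> becomes the title and whatever
# follows the separator stays as the sub-title.
sub promote_demoted_title {
    my ($title, $subtitle, $match_title, $new_title) = @_;
    return ($title, $subtitle) unless $title eq $match_title;
    if ($subtitle =~ /^\Q$new_title\E\s*(?:[.,:;-]\s*(.*))?$/) {
        return ($new_title, $1 // '');
    }
    return ($title, $subtitle);
}

my ($t, $s) = promote_demoted_title(
    "Blackadder", "Blackadder II: Potato", "Blackadder", "Blackadder II" );
print "$t / $s\n";
# prints: Blackadder II / Potato
```

Requiring a separator (or the end of the sub-title) after the matched text keeps a sub-title such as "Blackadder III: ..." from being matched by a rule for "Blackadder II".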



process_replacement_titles_desc - Rule #10

Replace specified <title> / <sub-title> with title/episode pair supplied using <desc>.



  If title, sub-title & desc match supplied data, then replace <title> and <sub-title> with new data supplied.

  rule: 10|Which Doctor~~Gunsmoke~Which Doctor~Festus and Doc go fishing, but are captured by a family that is feuding with the Haggens.
  in : "Which Doctor / "  desc = "  Festus and Doc go fishing, but are captured by a family that is feuding with the Haggens. ..."
  out: "Gunsmoke / Which Doctor"



process_replacement_titles_episodes - Rule #8

Replace specified <title> / <sub-title> with title/episode pair supplied.



  If title & sub-title match supplied data, then replace <title> and <sub-title> with new data supplied.

  rule: 8|Top Gear USA Special~Detroit~Top Gear~USA Special
  in : "Top Gear USA Special / Detroit"
  out: "Top Gear / USA Special"
 
  rule: 8|Top Gear USA Special~~Top Gear~USA Special
  in : "Top Gear USA Special / "
  out: "Top Gear / USA Special"
    or
  in : "Top Gear USA Special / 1/6."
  out: "Top Gear / 1/6. USA Special"



process_mixed_title_subtitle - Rule #2

Extract sub-title from <title>.



  If title starts with text + separator, then the text after it will be moved into the sub-title
  "separator" can be any of :;-

  rule: 2|Blackadder II
  in : "Blackadder II: Potato / "
  out: "Blackadder II / Potato"



process_mixed_subtitle_title - Rule #3

Extract sub-title from <title>.



  If title ends with separator + text, then the text before it will be moved into the sub-title
  "separator" can be any of :;-

  rule: 3|Storyville
  in : "Kings of Pastry :Storyville / "
  out: "Storyville / Kings of Pastry"



process_reversed_title_subtitle - Rule #4

Reverse <title> and <sub-title>



  If sub-title matches the rules text, then swap the title and sub-title

  rule: 4|Storyville
  in : "Kings of Pastry / Storyville"
  out: "Storyville / Kings of Pastry"



process_replacement_titles - Rule #5

Replace <title> with supplied text.



  If title matches the rules text, then use the replacement text supplied

  rule: 5|A Time Team Special~Time Team
  in : "A Time Team Special / Doncaster"
  out: "Time Team / Doncaster"



This is the one which you will probably use most. It can be used to fix most incorrect titles - e.g. spelling mistakes; punctuation; character case; etc.

process_subtitle_remove_text - Rule #13

Remove specified text from <sub-title> for a given <title>.



  If sub-title starts with text + separator, or ends with separator + text,
  then it will be removed from the sub-title.
  "separator" can be any of .,:;- and is optional.

  rule: 13|Time Team~A Time Team Special
  in : "Time Team / Doncaster : A Time Team Special "
  out: "Time Team / Doncaster"



process_replacement_episodes - Rule #7

Replace <sub-title> with supplied text.



  If sub-title matches the rules text, then use the replacement text supplied

  rule: 7|Time Team~Time Team Special: Doncaster~Doncaster
  in : "Time Team / Time Team Special: Doncaster"
  out: "Time Team / Doncaster"



process_replacement_ep_from_desc - Rule #9

Replace <sub-title> with supplied text when the <desc> matches that given.



  If title and desc match the rules text, then use the replacement sub-title supplied
  (any existing sub-title is replaced)

  rule: 9|Heroes of Comedy~The Goons~The series celebrating great British comics pays tribute to the Goons.
  in : "Heroes of Comedy / "
  out: "Heroes of Comedy / The Goons"
    or
  in : "Heroes of Comedy / Spike Milligan"
  out: "Heroes of Comedy / The Goons"



process_replacement_genres - Rule #6

Replace <category> with supplied text.



  If title matches the rules text, then use the replacement category(-ies) supplied
  (note ALL existing categories are replaced)

  rule: 6|Antiques Roadshow~Entertainment~Arts~Shopping
  in : "Antiques Roadshow / " category "Reality"
  out: "Antiques Roadshow / " category "Entertainment" + "Arts" + "Shopping"



process_replacement_film_genres - Rule #12

Replace Film/Films <category> with supplied text.



  If title matches the rules text and the prog has category "Film" or "Films", then use the replacement category(-ies) supplied
  (note ALL categories are replaced, not just "Film")

  rule: 12|The Hobbit Special~Entertainment~Interview
  in : "The Hobbit Special / " category "Film" + "Drama"
  out: "The Hobbit Special / " category "Entertainment" + "Interview"



process_translate_genres - Rule #14

Replace <category> with supplied value(s).



  If category matches one found in the prog, then replace it with the category(-ies) supplied
  (note any other categories are left alone)

  rule: 14|Soccer~Football
  in : "Leeds v Arsenal" category "Soccer"
  out: "Leeds v Arsenal" category "Football"
 
  rule: 14|Adventure/War~Action Adventure~War
  in : "Leeds v Arsenal" category "Adventure/War"
  out: "Leeds v Arsenal" category "Action Adventure" + "War"
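
The difference between translating one category (rule 14) and replacing all of them (rule 6) can be sketched on a plain list of category names. The helper names translate_genres and replace_genres are hypothetical, the exact, case-sensitive comparison is an assumption, and real XMLTV category data is nested rather than a flat list.

```perl
use strict;
use warnings;

# Rule 14 style: swap only the matching category for its replacement(s);
# any other categories are left alone.
sub translate_genres {
    my ($categories, $from, @to) = @_;
    return [ map { $_ eq $from ? @to : $_ } @$categories ];
}

# Rule 6 style: discard ALL existing categories and use the supplied ones.
sub replace_genres {
    my ($categories, @new) = @_;
    return [ @new ];
}

my $out = translate_genres( [ "Adventure/War", "Drama" ],
                            "Adventure/War", "Action Adventure", "War" );
print join(" + ", @$out), "\n";
# prints: Action Adventure + War + Drama
```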



AUTHOR

Geoff Westcott, honir.at.gmail.dot.com, Dec. 2014.

This code is based on the fixup method/code defined in tv_grab_uk_rt grabber and credit is given to the author Nick Morrott.


perl v5.20.3 AUGMENT (3) 2016-04-03
