softwareupdate.tv

 
 
TextPipe: Advanced features include task bar menu display and processing of extracted information
 

Advanced Features

Task Bar Menu

The task bar shows a small TextPipe icon while TextPipe is running.


Right-click the icon to open a menu of options. The entries below the separator line are filters that you can run.


Typically, the filters process text that is on the clipboard. This lets you copy text from your editor or word processor to the clipboard, run a TextPipe filter via the task bar icon, and then paste the converted text back into your application.

Logging

The filter options pane can be reached by selecting the topmost node in the filter tree. In the registered version, it lets you set logging profiles. Logging lets you track when a filter was run, who ran it, how long it ran, which files it processed or skipped, the input and output size of each file, and the changes each filter made to it. It also records any errors that occurred during a filter job.


There are three levels of log entry - Info, Warning and Error. Info lines are reported in the Results tab of the Status Window, and Warnings and Errors are reported in the Errors tab. All three are written to the log file. Logged errors modify TextPipe's exit code.
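The routing described above can be sketched with Python's standard logging module. This is only an illustration of the idea, not TextPipe's implementation: the `ListHandler` class, the handler names, and the rule that any logged error produces exit code 1 are all assumptions made for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in a list (a stand-in for a tab or log file)."""

    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


results_tab = ListHandler()
# Only Info-level entries go to the Results tab.
results_tab.addFilter(lambda record: record.levelno < logging.WARNING)
errors_tab = ListHandler(level=logging.WARNING)  # Warnings and Errors
log_file = ListHandler()  # all three levels; a real job might use logging.FileHandler

log = logging.getLogger("filter_job")  # hypothetical logger name
log.setLevel(logging.INFO)
log.propagate = False
for handler in (results_tab, errors_tab, log_file):
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    log.addHandler(handler)

log.info("processed input.txt")
log.warning("skipped locked.txt")
log.error("cannot open missing.txt")

# Assumption: any logged error turns the exit code non-zero.
exit_code = 1 if any(line.startswith("ERROR") for line in errors_tab.lines) else 0
print(results_tab.lines, errors_tab.lines, exit_code)
```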


What is Data Mining?

  • Data mining, or text mining, means processing a source of information to extract specific data from it.
  • Process a web site to extract product catalog and cost information. This can then be used to compare prices between different suppliers.
  • Process a web site to extract email addresses or web URLs.
  • Harvest the data on a web site for your own purposes.
  • Extracted data is designed to be easily loaded into a database for further analysis.
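As a rough illustration of this kind of extraction (written in Python, not with TextPipe itself), the sketch below pulls email addresses and URLs out of a fragment of page text and writes them as CSV rows that a database could load. The patterns are deliberately simplified, and the names `extract_records` and `to_csv` are hypothetical.

```python
import csv
import io
import re

# Simplified patterns for illustration only; real addresses and URLs
# are more varied than these expressions allow.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_records(text):
    """Return (kind, value) rows for every email address and URL found."""
    rows = [("email", match) for match in EMAIL_RE.findall(text)]
    rows += [("url", match) for match in URL_RE.findall(text)]
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header record."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["kind", "value"])
    writer.writerows(rows)
    return buf.getvalue()


sample = '<a href="http://example.com/catalog">Catalog</a> Contact: sales@example.com'
print(to_csv(extract_records(sample)))
```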

How can TextPipe help?

TextPipe can be used to generate an extract from any text data source, including web sites. TextPipe can also be used to perform data cleansing or any additional processing, for example:

  • add a header record (e.g. provide column titles for .CSV files)
  • remove unwanted data
  • replace specific text
  • convert line endings to DOS/Unix/Mac format
  • expand tabs
  • fix capitalization
  • convert from EBCDIC to ASCII
  • remove multiple whitespace
  • remove columns, lines or fields
  • remove duplicate records
  • sort
  • extract email addresses from specific fields
  • discard records matching a pattern
  • and much more
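To make a few of these steps concrete, here is a minimal Python sketch that chains several of them: converting line endings, expanding tabs, removing multiple whitespace, removing duplicate records, sorting, and adding a header record. It shows the general idea only and is not how TextPipe implements its filters; the function name `cleanse`, the tab width of 4, and the sample data are assumptions.

```python
import re


def cleanse(text, header="name,email"):
    """Apply a small chain of cleansing steps and return the cleaned text."""
    # Convert DOS (\r\n) and classic Mac (\r) line endings to Unix (\n).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    records = []
    for line in text.split("\n"):
        line = line.expandtabs(4)                    # expand tabs (width 4 assumed)
        line = re.sub(r" {2,}", " ", line).strip()   # remove multiple whitespace
        if line:                                     # discard empty records
            records.append(line)
    unique_sorted = sorted(set(records))             # remove duplicates, then sort
    return "\n".join([header] + unique_sorted) + "\n"  # add a header record


raw = "bob,bob@example.com\r\nalice,\talice@example.com\r\nbob,bob@example.com\r\n"
print(cleanse(raw))
```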

Optimizing Performance

Memory
If you're sorting large files, give TextPipe as much memory as possible. Close EVERY unnecessary application.

Once TextPipe starts sorting, try not to start any new programs, because TextPipe's 'memory full' benchmark will then be incorrect. TextPipe assumes that you will give it as much memory as possible, and that the available memory won't decrease while it is performing a sort.

Disk I/O and virus scanners
The slowest operation that TextPipe performs is reading from and writing to disk. You can improve performance by making sure that all files being processed are stored on local disks rather than on network servers. You can also increase speed by an order of magnitude by using a RAM drive (a disk held in memory), although naturally this won't help if the files you process are very large.

  • Disable any virus scanning while TextPipe is running.
  • TextPipe utilizes specific Windows API calls to enhance the speed of reading data files.

Temporary files
TextPipe doesn't use temporary files at all, except for sorting, where they are unavoidable. Otherwise, TextPipe only ever writes out the output file itself: it uses a file name like TXPxxx.tmp until the file is completely written out, then renames it to the actual output filename.
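The write-then-rename approach can be sketched in Python as follows. This is an illustration under assumptions rather than TextPipe's own code: the function name `write_output_then_rename` is hypothetical, and the TXP prefix simply mirrors the file name pattern mentioned above.

```python
import os
import tempfile


def write_output_then_rename(path, chunks):
    """Write chunks to a temporary file, then rename it to the final path."""
    folder = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(prefix="TXP", suffix=".tmp", dir=folder)
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
        os.replace(tmp_path, path)  # rename only once the file is complete
    except BaseException:
        # Never leave a half-written temporary file behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, "output.txt")
    write_output_then_rename(target, ["line 1\n", "line 2\n"])
    print(os.listdir(demo_dir))
```

Because the final name only appears after the rename, other programs never see a partially written output file.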

If you have enough memory, the entire sort is performed in memory for speed. Every 10000 lines, TextPipe checks to see if there is less than 16MB of physical memory (not virtual memory!). If so, it writes the sorted results so far to a temporary file and then continues. If the Output Filter is set to File Output or Single File Output, any temporary files are written to the same folder as the output file. If the Output Filter is set to Clipboard Output then any temporary files are written to the current folder. All temporary files are removed as soon as possible as the sort progresses.
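The general technique described here is an external merge sort: sort in memory while possible, spill sorted runs to temporary files when memory runs short, and merge the runs at the end. The Python sketch below illustrates it under stated assumptions. It spills after a fixed number of lines (`max_in_memory`, a hypothetical parameter) instead of checking free physical memory, it expects every line to end with a newline, and it is not TextPipe's actual algorithm.

```python
import heapq
import os
import tempfile


def sort_lines(lines, max_in_memory=10000, tmp_dir=None):
    """Yield newline-terminated lines in sorted order, spilling runs to disk."""
    run_paths = []
    buffer = []

    def spill():
        # Write the sorted results so far to a temporary file and free the buffer.
        buffer.sort()
        fd, path = tempfile.mkstemp(suffix=".tmp", dir=tmp_dir)
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as run:
            run.writelines(buffer)
        run_paths.append(path)
        buffer.clear()

    for line in lines:
        buffer.append(line)
        if len(buffer) >= max_in_memory:  # stand-in for a low-memory check
            spill()

    buffer.sort()
    if not run_paths:
        yield from buffer  # everything fitted in memory
        return

    files = [open(p, "r", encoding="utf-8", newline="") for p in run_paths]
    try:
        # Merge the in-memory remainder with the sorted runs on disk.
        yield from heapq.merge(buffer, *files)
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)  # remove temporary files as soon as the merge ends


data = [f"{n:05d}\n" for n in (5, 3, 9, 1, 7, 2, 8)]
print("".join(sort_lines(data, max_in_memory=3)))
```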

Pattern matching
You can improve matching performance by an order of magnitude by letting patterns fail earlier, which you do by limiting what wildcards like .* can match. If you can replace .* (match any character 0 or more times) with [^>]* (match any character except '>' 0 or more times) or [^>]{0,200} (match any character except '>' up to 200 times), your patterns will match, or fail to match, far more quickly.
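The effect is easy to observe with any backtracking regular expression engine. The sketch below uses Python's re module, not TextPipe's matcher, so the timings are only indicative. The sample text and the patterns are made up for the demonstration: the text contains many tags but no id= attribute, so both patterns must fail, and the question is how much work each does before failing.

```python
import re
import timeit

# One long line with 500 tags and no "id=" attribute anywhere.
text = '<b class="x">item</b>' * 500

slow = re.compile(r"<b .*id=")      # .* can run to the end of the line at every tag
fast = re.compile(r"<b [^>]*id=")   # [^>]* stops at the end of the current tag

# Both patterns agree when an id attribute is present inside a tag.
tag = '<b class="x" id="y">'
print(slow.search(tag).group() == fast.search(tag).group())

# Both fail on the long line, but the bounded pattern gives up much sooner.
slow_time = timeit.timeit(lambda: slow.search(text), number=3)
fast_time = timeit.timeit(lambda: fast.search(text), number=3)
print(f"unbounded .*: {slow_time:.4f}s  bounded [^>]*: {fast_time:.4f}s")
```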

Null Characters

Clipboard Limitations
The Windows clipboard cannot be used to process text that contains null characters (ASCII code 0). This is because the clipboard contents are defined as null-terminated, so operations on them may halt prematurely. Note that this is a Windows limitation, not a TextPipe limitation. Binary data should be placed in a file for TextPipe to process it.
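The truncation that null termination causes can be demonstrated without touching the clipboard. This Python sketch uses a ctypes buffer as a stand-in for any API that treats data as a null-terminated string; the sample bytes are made up.

```python
import ctypes

data = b"first part\x00second part"

# create_string_buffer copies the bytes and appends a terminating null.
buf = ctypes.create_string_buffer(data)

as_c_string = buf.value  # read as a null-terminated string: stops at the first null
full_buffer = buf.raw    # the whole buffer, including the embedded null

print(as_c_string)
print(full_buffer)
```

Everything after the first null is still in memory, but code that reads the data as a null-terminated string never sees it.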

Regular expression limitations
While TextPipe's Perl pattern matcher DOES allow nulls to be searched for, TextPipe's egrep implementation does not.
