This tool provides (reasonably) fast text search through large CSV/TSV files in which each line is a timestamped unit of text. The main search feature counts the number of lines in which a query, or each of several queries, appears. The tool also offers several features for exploring the contexts in which queries occur.
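As a rough illustration of the main search feature, the sketch below streams a tab-separated file and counts the lines in which each query appears. It is not the tool's actual code: the function name `count_matching_lines`, the column names, and the case-insensitive substring matching are all assumptions made for the example.

```python
import csv
import io

def count_matching_lines(handle, text_column, queries, delimiter="\t"):
    """Count, for each query, the rows whose text column contains it.

    Assumption: matching is a case-insensitive substring test; the real
    tool may tokenize or match differently.
    """
    counts = {query: 0 for query in queries}
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:  # rows are streamed, so the whole file is never held in memory
        text = (row.get(text_column) or "").lower()
        for query in queries:
            if query.lower() in text:
                counts[query] += 1
    return counts

# Hypothetical sample data standing in for a large TSV file.
sample = io.StringIO(
    "created_at\ttext\n"
    "2020-01-01 10:00\tWe are here\n"
    "2020-01-01 11:30\tthey are not\n"
    "2020-01-02 09:15\twe are late\n"
)
result = count_matching_lines(sample, "text", ["we are", "they"])
print(result)
```

Reading row by row rather than loading the file at once is what keeps this kind of search workable on very large files.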

Source code is available on GitHub, and documentation is also available. Written by Bernhard Rieder, with the support of Universitat Autònoma de Barcelona.

Choose a file to work with

This tool works on files read from a data directory on the machine it runs on. Since it is designed to run on (very) big files, there is currently no upload function - ask your administrator how to add files.

Files in data directory:

Define your analysis

Choose the file columns to use

Timestamp:
Text: add
Score: add

Search parameters

Search query:
(leave empty for no query; OR and AND are supported; separate multiple queries with commas)
File language:
Start date:
YYYY-MM-DD or YYYY-MM-DD HH:MM
End date:
YYYY-MM-DD or YYYY-MM-DD HH:MM
Time interval:
minute hour day week month year
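
The date formats and time intervals listed above could be handled along the following lines. This is only a sketch under stated assumptions: the helper names `parse_date`, `bucket_start` and `count_per_interval` are hypothetical, weeks are assumed to start on Monday, and a date given without a time is read as midnight, so an end date of `2020-01-31` excludes lines from later that day.

```python
from collections import Counter
from datetime import datetime, timedelta

def parse_date(value):
    """Accept 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM'."""
    for fmt in ("%Y-%m-%d %H:%M", "%Y-%m-%d"):
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported date format: {value!r}")

def bucket_start(moment, interval):
    """Round a datetime down to the start of its interval."""
    if interval == "minute":
        return moment.replace(second=0, microsecond=0)
    if interval == "hour":
        return moment.replace(minute=0, second=0, microsecond=0)
    day = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    if interval == "day":
        return day
    if interval == "week":
        # Assumption: weeks start on Monday (weekday() == 0).
        return day - timedelta(days=day.weekday())
    if interval == "month":
        return day.replace(day=1)
    if interval == "year":
        return day.replace(month=1, day=1)
    raise ValueError(f"unknown interval: {interval!r}")

def count_per_interval(timestamps, start, end, interval):
    """Count timestamps between start and end (inclusive) per interval bucket."""
    lower, upper = parse_date(start), parse_date(end)
    counts = Counter()
    for raw in timestamps:
        moment = parse_date(raw)
        if lower <= moment <= upper:
            counts[bucket_start(moment, interval)] += 1
    return dict(counts)

# Hypothetical timestamps standing in for the file's timestamp column.
stamps = ["2020-01-01 10:00", "2020-01-01 11:30", "2020-01-02 09:15", "2020-02-03"]
per_day = count_per_interval(stamps, "2020-01-01", "2020-01-31", "day")
print(per_day)
```

Each bucket is keyed by the start of its interval, which makes the counts easy to plot as points on a line graph.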

Analysis options

show full count on top linegraph
show of score column as extra line
show word context (with words in lists and a window of words before and after; 0 = no limit) EXPERIMENTAL: limit context to column:
show word tree (experimental, use with a single query only; works well with queries like [we are]; can get very big for very common words; start with a small query)
create a summary file for the query
write filtered lines to new file (use wisely)
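
To give an idea of what the word context option computes, the sketch below collects the words that appear within a window before and after each occurrence of a query, with a window of 0 meaning no limit. The function name `word_context`, the simple `\w+` tokenization, and the lowercasing are assumptions for illustration; the tool's own implementation may differ.

```python
import re
from collections import Counter

def word_context(lines, query, window=3):
    """Count words up to `window` words before and after `query`.

    window=0 means no limit (all words in the line on that side).
    """
    query_words = query.lower().split()
    n = len(query_words)
    before, after = Counter(), Counter()
    if n == 0:  # an empty query has no context
        return before, after
    for line in lines:
        # Assumption: words are runs of letters, digits or underscores.
        words = re.findall(r"\w+", line.lower())
        for i in range(len(words) - n + 1):
            if words[i:i + n] != query_words:
                continue
            left = words[:i] if window == 0 else words[max(0, i - window):i]
            right = words[i + n:] if window == 0 else words[i + n:i + n + window]
            before.update(left)
            after.update(right)
    return before, after

# Hypothetical lines standing in for the file's text column.
lines = ["We are here today", "they said we are late again", "we were there"]
before, after = word_context(lines, "we are", window=1)
print(before.most_common(), after.most_common())
```

Keeping the before and after counts separate mirrors the idea of listing context words on either side of the query.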