Search Expression Queries
Overview
The Search Engine lets you compose expressions that find posts by the words and phrases they contain, as well as by attributes such as feed, site, domain, country, language, topic, category, and publication or fetch date. An expression consists of operations classified as basic, logical and restrictive. Basic operations return lists of posts by specific keys such as word, source or site. The set of basic operations is derived from the data structure of the system. Logical operations combine expressions by uniting or intersecting search results, or by excluding one result from another. Restrictive operations narrow down the result set by filtering posts on specific criteria.
Basic operations
The basic operations ALL, ANY, CATEGORY, EVERY, SITE and URL search for posts by the words they contain or by the source location the posts come from.
Operation ALL has the form ALL modifier ... term ..., where a term is a word or a phrase enclosed in double or single quotes. The operation returns only those posts that contain all of the specified terms. The optional modifier CORE means that only the "important" words of a post are considered. The positional modifiers TITLE, DESCRIPTION, IMAGE-ALT and KEYWORDS define the parts of a post in which the terms must be found: TITLE - in the title, DESCRIPTION - in the annotation, IMAGE-ALT - in the explanatory text for an image, KEYWORDS - in the automatically created invisible list of words characterizing a post. If no positional modifiers are given, TITLE, DESCRIPTION and IMAGE-ALT are assumed implicitly.
The operation name is optional. If you enter only search terms, or start with modifiers alone, the Search Engine assumes the word ALL at the beginning.
The search ignores character case, which is why fly Moon and fly moon give the same result. It also ignores word forms, so tournament winner and tournaments winners are identical searches. When you search for a phrase, the Search Engine returns only those posts that have no punctuation marks between the words of the phrase.
Single and double quotes around a phrase behave differently. When a phrase is double quoted, the words must appear in a post in exactly the same form to be found. When a phrase is single quoted, word form is not taken into account. So 'space shuttle' will find all 3 phrases:
... the space shuttle Atlantis ... all space shuttles ... space shuttle's fuel tank ...
while "space shuttle" will find only the first phrase. With the CORE modifier, a quoted phrase matches a post if at least one word of the phrase belongs to the set of the post's "important" words. Usage examples:
asteroid
'space shuttle'
TITLE "solar system" research
ALL solar system
ALL CORE rocket 'launch failed'
ALL TITLE DESCRIPTION rocket
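The ALL syntax described above can be sketched as a small parser. This is an illustration only, not the Search Engine's own code: the function name parse_all_query, the returned dictionary layout and the assumption that keywords are written in upper case are all hypothetical.

```python
import re

POSITIONAL = {"TITLE", "DESCRIPTION", "IMAGE-ALT", "KEYWORDS"}
# Positions assumed when the query names none (per the description above).
DEFAULT_POSITIONS = ["TITLE", "DESCRIPTION", "IMAGE-ALT"]

# Group 1: double-quoted phrase, group 2: single-quoted phrase, group 3: bare word.
TOKEN = re.compile(r'"([^"]*)"|\'([^\']*)\'|(\S+)')


def parse_all_query(query):
    """Split an ALL query into modifiers and terms (hypothetical helper)."""
    tokens = [(m.lastindex, m.group(m.lastindex)) for m in TOKEN.finditer(query)]
    pos = 0
    # The operation name is optional; ALL is assumed when it is missing.
    if tokens and tokens[0] == (3, "ALL"):
        pos = 1
    core = False
    positions = []
    # Leading bare tokens that name a modifier are consumed as modifiers.
    while pos < len(tokens) and tokens[pos][0] == 3:
        word = tokens[pos][1]
        if word == "CORE":
            core = True
        elif word in POSITIONAL:
            positions.append(word)
        else:
            break
        pos += 1
    # Only double-quoted phrases require the exact word form; case is ignored.
    terms = [
        {"text": text.lower(), "exact_form": kind == 1}
        for kind, text in tokens[pos:]
    ]
    return {
        "operation": "ALL",
        "core": core,
        "positions": positions or list(DEFAULT_POSITIONS),
        "terms": terms,
    }
```

For example, parse_all_query('TITLE "solar system" research') yields the single position TITLE, an exact-form phrase "solar system" and a plain word "research".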
Operation ANY has the form ANY (threshold)? modifier ... term ... and differs from ALL in one respect: it finds posts that contain not necessarily all of the listed terms, but at least threshold of them (1 by default). Usage examples:
ANY meteorite comet
ANY 2 astronaut "space shuttle" "progress M-60"
ANY 2 CORE moon mars jupiter saturn
ANY 2 CORE TITLE KEYWORDS moon mars jupiter saturn
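The difference between ALL and ANY can be sketched as follows. This is a simplified illustration that assumes single-word terms and a post already split into words; it ignores word forms, phrases and modifiers, and the function names are hypothetical.

```python
def count_present(post_words, terms):
    """Count how many of the terms occur in the post, ignoring case."""
    words = {w.lower() for w in post_words}
    return sum(1 for t in terms if t.lower() in words)


def matches_all(post_words, terms):
    # ALL: every listed term must be present.
    return count_present(post_words, terms) == len(terms)


def matches_any(post_words, terms, threshold=1):
    # ANY: at least `threshold` of the listed terms must be present.
    return count_present(post_words, terms) >= threshold
```

With the post words "Mars and Saturn seen near the Moon", matches_any(..., ["moon", "mars", "jupiter", "saturn"], 2) holds, while matches_all with the same terms does not, because jupiter is missing.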
Operation CATEGORY has the form CATEGORY category path... and returns posts attributed to any of the specified categories. Leading and trailing slashes in a category path are optional. If a category path contains a space character, enclose the path in quotes; otherwise it will be interpreted as 2 separate paths. Usage examples:
CATEGORY /Science/Astronomy Science/Biology/
CATEGORY "/Sport/Model Aircraft" Science
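The path rules above (optional slashes, quoting for spaces) can be sketched with the standard shlex module. The canonical form with a single leading slash is an assumption made for this illustration, and both function names are hypothetical.

```python
import shlex


def normalize_category_path(path):
    """Return the path with one leading slash and no trailing slash."""
    parts = [p for p in path.strip("/").split("/") if p]
    return "/" + "/".join(parts)


def parse_category_paths(arguments):
    """Split the arguments of CATEGORY into paths, honoring quotes."""
    return [normalize_category_path(p) for p in shlex.split(arguments)]
```

Here parse_category_paths('"/Sport/Model Aircraft" Science') gives two paths, while the same text without quotes would be split at the space into three.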
Operation EVERY returns posts from all sources. Usage examples:
EVERY
EVERY DOMAIN ru
Operation SITE has the form SITE source site... and returns posts from sources belonging to the specified sites. Usage examples:
SITE news.yahoo.com prw.com
SITE news.yandex.ru
Operation URL has the form URL source url... and returns posts from the specified sources. Usage examples:
URL http://n.y.com/z
URL y.ru/a y.ru/b
Logical operations
Besides basic operations there are the logical operations AND, EXCEPT and OR, which let you specify more complex searches, so-called search expressions. Strictly speaking, basic operations are search expressions as well, though trivial ones.
Operation AND has the form search expression AND search expression and returns posts that were found for both the left and the right search expression. Usage examples:
ANY meteorite comet AND SITE msn.com
rocket AND ANY space canaveral
Operation EXCEPT has the form search expression EXCEPT search expression and returns only those posts that were found for the left search expression but not for the right one. Usage examples:
virus EXCEPT ANY computer phone program
SITE c.com EXCEPT URL http://c.com/z
Operation OR has the form search expression OR search expression and returns posts that were found for the left or the right search expression. Usage examples:
SITE ya.com OR URL pr.com/x
space research OR ANY astronaut spaceship "space shield"
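If each search expression is thought of as a set of post identifiers, the three logical operations correspond to set intersection, difference and union. The sketch below uses made-up post IDs purely for illustration.

```python
# Hypothetical post IDs returned by three search expressions.
virus = {1, 2, 3, 4}
any_computer_phone_program = {3, 4, 5}
site_c_com = {2, 4, 6}

# AND: posts found for both expressions.
and_result = virus & site_c_com
# EXCEPT: posts found for the left expression but not for the right one.
except_result = virus - any_computer_phone_program
# OR: posts found for either expression.
or_result = virus | site_c_com
```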
Search expressions that contain more than basic operations are called compound. They are evaluated in order of decreasing operation priority, and from left to right for operations with the same priority. The evaluation order can be changed with parentheses. AND and EXCEPT have equal priority, which is higher than that of the OR operation.
This is why
comet AND SITE rss.msnbc.msn.com OR rocket AND ANY space canaveral
is equivalent to
( comet AND SITE rss.msnbc.msn.com ) OR ( rocket AND ANY space canaveral ).
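The priority rules can be sketched as a small precedence-climbing routine that makes the implicit grouping explicit. It is an illustration under simplifying assumptions: operators are upper-case words, quoted phrases do not contain operator words or parentheses, and restrictive operations are not handled. The name parenthesize is hypothetical.

```python
# Higher number means higher priority; AND and EXCEPT are equal.
PRIORITY = {"OR": 1, "AND": 2, "EXCEPT": 2}
BRACKETS = {"(", ")"}


def parenthesize(query):
    """Return the query with the grouping of logical operations made explicit."""
    tokens = query.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse_operand():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            inner = parse_expression(1)
            pos += 1  # skip the closing ")"
            return inner
        # A basic operation runs until the next logical operator or bracket.
        start = pos
        while pos < len(tokens) and tokens[pos] not in PRIORITY and tokens[pos] not in BRACKETS:
            pos += 1
        return " ".join(tokens[start:pos])

    def parse_expression(min_priority):
        nonlocal pos
        left = parse_operand()
        while pos < len(tokens) and PRIORITY.get(tokens[pos], 0) >= min_priority:
            op = tokens[pos]
            pos += 1
            # Requiring a strictly higher priority on the right gives
            # left-to-right evaluation for operators of equal priority.
            right = parse_expression(PRIORITY[op] + 1)
            left = f"( {left} {op} {right} )"
        return left

    return parse_expression(1)
```

Applied to a AND b EXCEPT c, the sketch returns ( ( a AND b ) EXCEPT c ), reflecting left-to-right evaluation of equal-priority operations.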
Restrictive operations
Besides basic and logical operations, the Search Engine provides a set of restrictive operations: CAPACITY, COUNTRY, DATE, DOMAIN, FETCHED, LANGUAGE, VISITED, WITH. Restrictive operations have equal priority, which is lower than that of all other operations.
Operation CAPACITY has 2 forms: search expression CAPACITY number... and search expression CAPACITY NOT number.... The first form returns only those posts that were found for the search expression on the left and whose events unite no fewer posts than the specified number. The second returns those that were found for the expression on the left and whose events unite fewer posts than the specified number. Usage examples:
EVERY CAPACITY 10
medicine CAPACITY NOT 10
Operation COUNTRY has 2 forms: search expression COUNTRY country code... and search expression COUNTRY NOT country code.... The first form returns only those posts that were found for the search expression on the left and published by a source from one of the countries listed on the right. The second returns those that were found for the expression on the left and published by a source not from any of the countries listed on the right. Usage examples:
ipod OR itunes COUNTRY USA
satellite launch COUNTRY NOT USA GBR
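Because restrictive operations have the lowest priority, they act as a filter over the whole result of the expression on their left: ipod OR itunes COUNTRY USA restricts the combined OR result. A minimal sketch of the two COUNTRY forms follows; the post representation (a dictionary with a "country" field) and the function name are assumptions made for illustration.

```python
def restrict_country(posts, codes, negate=False):
    """Keep posts whose source country is (or, with negate, is not) listed."""
    wanted = {c.upper() for c in codes}
    return [p for p in posts if (p["country"].upper() in wanted) != negate]


# Hypothetical result of the expression on the left.
found = [
    {"id": 1, "country": "USA"},
    {"id": 2, "country": "GBR"},
    {"id": 3, "country": "RUS"},
]

only_usa = restrict_country(found, ["USA"])                       # COUNTRY USA
not_usa_gbr = restrict_country(found, ["USA", "GBR"], negate=True)  # COUNTRY NOT USA GBR
```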
Operation DATE has 2 forms: search expression DATE time and search expression DATE BEFORE time. The first form returns only those posts that were found for the search expression on the left and published not earlier than the specified time. The second returns those that were found for the expression on the left and published earlier than the specified time. The time must be expressed in one of the forms described below. Usage examples:
president bush DATE 2H
( canaveral DATE BEFORE 2D ) OR ( astronaut COUNTRY USA DATE 3D )
Operation DOMAIN also has 2 forms: search expression DOMAIN internet-domain... and search expression DOMAIN NOT internet-domain.... The first form returns only those posts that were found for the search expression on the left and published by a source in one of the domains listed on the right. The second returns those that were found for the expression on the left and published by a source not belonging to any of the domains listed on the right. Usage examples:
ANY mac apple DOMAIN ru by
ANY linux DOMAIN NOT microsoft.com
Operation FETCHED has 2 forms: search expression FETCHED time and search expression FETCHED BEFORE time. The first form returns only those posts that were found for the search expression on the left and fetched from a source not earlier than the specified time. The second returns those that were found for the expression on the left and fetched earlier than the specified time. The time must be expressed in one of the forms described below. Usage examples:
president bush FETCHED 50M
( nasa FETCHED BEFORE 500S ) OR ( hubble COUNTRY USA FETCHED 3H )
Operation LANGUAGE has 2 forms: search expression LANGUAGE language code... and search expression LANGUAGE NOT language code.... The first form returns only those posts that were found for the search expression on the left and written in one of the languages listed on the right. The second returns those that were found for the expression on the left and written in a language different from all of those listed on the right. Usage examples:
ipod OR itunes LANGUAGE eng
satellite launch COUNTRY USA LANGUAGE NOT eng rus
Operation VISITED has 2 forms: search expression VISITED time and search expression VISITED BEFORE time. The first form returns only those posts that were found for the search expression on the left and were last opened not earlier than the specified time. The second returns those that were found for the expression on the left and were last opened earlier than the specified time or were not opened at all. The time must be expressed in one of the forms described below. A post is considered opened in any of the following cases:
  • user followed a link to full article page;
  • user followed a link to an image;
  • post belongs to an event opened by following a link "Whole story".
Usage examples:
EVERY VISITED 7D
Operation WITH has 2 forms: search expression WITH feature... and search expression WITH NO feature.... The first form returns only those posts that were found for the search expression on the left and have all of the specified features. The second returns those that were found for the expression on the left and have none of the specified features. The only feature available at the moment is IMAGE. Usage examples:
ipod OR itunes LANGUAGE eng WITH IMAGE
EVERY WITH NO IMAGE
Time
When using operations DATE, FETCHED and VISITED you should express the time in one of the following forms:
  • YYYY-MM-DD.hh:mm:ss. UTC time is assumed. Example:
    FETCHED 2009-03-25.17:45:07
  • number unit, where unit takes one of the following values: D (days), H (hours), M (minutes), S (seconds). This form points to a moment in the past distant from now by specified number of time units. Example:
    FETCHED 23H (23 hours ago)
  • number of seconds since Epoch (00:00:00 UTC, January 1, 1970). Example:
    FETCHED 1239693207
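The three time forms listed above can be sketched as one parsing function that resolves each of them to a UTC moment. This is an illustration only; the function name parse_time, the optional now parameter and the order in which the forms are tried are assumptions, not documented behavior of the Search Engine.

```python
import re
from datetime import datetime, timedelta, timezone

UNIT_SECONDS = {"D": 86400, "H": 3600, "M": 60, "S": 1}


def parse_time(text, now=None):
    """Resolve a DATE/FETCHED/VISITED time argument to a UTC datetime."""
    now = now or datetime.now(timezone.utc)
    # Form 1: YYYY-MM-DD.hh:mm:ss, interpreted as UTC.
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d.%H:%M:%S")
        return parsed.replace(tzinfo=timezone.utc)
    except ValueError:
        pass
    # Form 2: a number followed by a unit, counted back from now.
    m = re.fullmatch(r"(\d+)\s*([DHMS])", text)
    if m:
        seconds = int(m.group(1)) * UNIT_SECONDS[m.group(2)]
        return now - timedelta(seconds=seconds)
    # Form 3: number of seconds since the Epoch.
    if text.isascii() and text.isdigit():
        return datetime.fromtimestamp(int(text), tz=timezone.utc)
    raise ValueError(f"unrecognized time: {text!r}")
```

Passing a fixed now makes the relative form reproducible, for instance in tests: parse_time("23H", now=some_moment) is exactly 23 hours before some_moment.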