$Revision: 6121 $
Copyright 2010-2020 Fred Toussi. Permission is granted to distribute this document without any alteration under the terms of the HSQLDB license. Additional permission is granted to the HSQL Development Group to distribute this document with or without alterations under the terms of the HSQLDB license.
2020-06-29
HyperSQL data access and data change statements are compatible with the latest SQL:2016 Standard. There are a few extensions and some relaxation of rules, but these do not affect statements that are written to the Standard syntax. There is full support for classic SQL, as specified by SQL-92, and many enhancements added in later versions of the standard.
An SQL statement can be executed in two ways. One way is to use the
    java.sql.Statement interface. The Statement object
    can be reused to execute completely different SQL statements.
    Alternatively, a PreparedStatement can be used to
    execute an SQL statement repeatedly, and the statement can use
    parameters. Using either form, if the SQL statement is a query expression,
    a ResultSet is returned.
In SQL, when a query expression (SELECT or similar SQL statement) is
    executed, an ephemeral table is created. When this table is returned to
    the application program, it is returned as a result set, which is accessed
    row by row by a cursor. A JDBC ResultSet represents
    an SQL result set and its cursor.
The minimal definition of a cursor is a list of rows with a position that can move forward. Some cursors also allow the position to move backwards or jump to any position in the list.
An SQL cursor has several attributes. These attributes depend on the
    query expression. Some of these attributes can be overridden by specifying
    qualifiers in the SQL statement or by specifying values for the parameters
    of the JDBC Statement or
    PreparedStatement.
The columns of the result set are determined by the query
      expression. The number of columns and the type and name characteristics
      of each column are known when the query expression is compiled and
      before its execution. This metadata information remains constant
      regardless of changes to the contents of the tables used in the query
      expression. The metadata for the JDBC ResultSet
      is in the form of a ResultSetMetaData object.
      Various methods of the ResultSetMetaData
      interface return different properties of each column of the
      ResultSet.
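As a rough illustration of the metadata methods mentioned above, the sketch below lists the label and SQL type name of each column. The class and method names (MetaDataSketch, describeColumns) are invented for this example; only getColumnCount(), getColumnLabel() and getColumnTypeName() belong to the JDBC ResultSetMetaData interface.

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of HyperSQL or JDBC.
final class MetaDataSketch {

    // Returns one "LABEL TYPE" entry per column; JDBC column indexes start at 1.
    static List<String> describeColumns(ResultSetMetaData md) throws SQLException {
        List<String> columns = new ArrayList<>();
        for (int i = 1; i <= md.getColumnCount(); i++) {
            columns.add(md.getColumnLabel(i) + " " + md.getColumnTypeName(i));
        }
        return columns;
    }
}
```

A typical call would be describeColumns(rs.getMetaData()) for an open ResultSet named rs.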
A result set may contain 0 or more rows. The rows are determined by the execution of the query expression.
The setMaxRows(int) method of JDBC
      Statement allows limiting the number of rows
      returned by the statement. This limit is conceptually applied after the
      result has been built, and the excess rows are discarded.
A cursor is either scrollable or not. Scrollable cursors allow accessing rows by absolute or relative positioning. No-scroll cursors only allow moving to the next row. The cursor can be optionally declared with the SQL qualifiers SCROLL, or NO SCROLL. The JDBC statement parameter can be specified as: TYPE_FORWARD_ONLY and TYPE_SCROLL_INSENSITIVE. The JDBC type TYPE_SCROLL_SENSITIVE is not supported by HSQLDB.
The default is NO SCROLL or TYPE_FORWARD_ONLY.
When a JDBC ResultSet is opened, it is
      positioned before the first row. Using the
      next() method, the position is moved to the
      first row. While the ResultSet is positioned on a
      row, various getter methods can be used to access the columns of the
      row.
The result returned by some query expressions is updatable. HSQLDB supports core SQL updatability features, plus some enhancements from the SQL optional features.
A query expression is updatable if it is a SELECT from a single underlying base table (or updatable view) either directly or indirectly. A SELECT statement featuring DISTINCT or GROUP BY or FETCH, LIMIT, OFFSET is not updatable. In an updatable query expression, one or more columns are updatable. An updatable column is a column that can be traced directly to the underlying table. Therefore, columns that contain expressions are not updatable. Examples of updatable query expressions are given below. The view V is updatable when its query expression is updatable. The SELECT statement from this view is also updatable:
SELECT A, B FROM T WHERE C > 5
SELECT A, B FROM (SELECT * FROM T WHERE C > 10) AS TT WHERE TT.B < 10
CREATE VIEW V(X,Y) AS SELECT A, B FROM T WHERE C > 0 AND B < 10
SELECT X FROM V WHERE Y = 5
If a cursor is declared with the SQL qualifier, FOR
      UPDATE OF <column name list>, then only the stated
      columns in the result set become updatable. If any of the stated columns
      is not actually updatable, then the cursor declaration will not
      succeed.
If the SQL qualifier, FOR UPDATE is used, then all the updatable columns of the result set become updatable.
If a cursor is declared with FOR READ ONLY, then it is not updatable.
In HyperSQL, if FOR READ ONLY or FOR UPDATE is not used then all the updatable columns of the result set become updatable. This relaxes the SQL standard rule that in this case limits updatability to only simply updatable SELECT statements (where all columns are updatable).
In JDBC, CONCUR_READ_ONLY or CONCUR_UPDATABLE can be specified for
      the Statement parameter. CONCUR_UPDATABLE is
      required if the returning ResultSet is to be updatable. If
      CONCUR_READ_ONLY, which is the default, is used, then even an updatable
      ResultSet becomes read-only.
When a ResultSet is updatable, various
      setter methods can be used to modify the column values. The names of the
      setter methods begin with "update". After all the updates on a row are
      done, the updateRow() method must be called
      only once to finalise the row update.
An updatable ResultSet may or may not be
      insertable-into. In an insertable ResultSet, all
      columns of the result are updatable and any column of the base table
      that is not in the result must be a generated column or have a default
      value.
In the ResultSet object, a special
      pseudo-row, called the insert row, is used to populate values for
      insertion into the ResultSet (and consequently,
      into the base table). The setter methods must be used on all the
      columns, followed by a call to
      insertRow().
Individual rows from all updatable result sets can be deleted one
      at a time. The deleteRow() method is called when the
      ResultSet is positioned on a row.
While using an updatable ResultSet to modify data, it is recommended not to change the same data using another ResultSet and not to execute SQL data change statements that modify the same data.
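The update, insert and delete steps described above can be sketched as follows. This is an illustration only: the class and method names are hypothetical, the ResultSet is assumed to come from a statement created with ResultSet.CONCUR_UPDATABLE on a connection with autocommit switched off, and the column positions and types (an INTEGER in position 1, a character column in position 2) are assumptions.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper class; only the ResultSet methods are real JDBC API.
final class UpdatableResultSetSketch {

    // Doubles an INTEGER column in every row and returns the number of rows changed.
    static int doubleColumn(ResultSet rs, String column) throws SQLException {
        int changed = 0;
        while (rs.next()) {
            rs.updateInt(column, rs.getInt(column) * 2); // stage the new value
            rs.updateRow();                              // finalise the row once
            changed++;
        }
        return changed;
    }

    // Adds one row through the insert row, then returns to the previous position.
    static void insertPair(ResultSet rs, int a, String b) throws SQLException {
        rs.moveToInsertRow();
        rs.updateInt(1, a);
        rs.updateString(2, b);
        rs.insertRow();
        rs.moveToCurrentRow();
    }

    // Deletes each row whose INTEGER column equals the given value.
    static int deleteMatching(ResultSet rs, String column, int value) throws SQLException {
        int deleted = 0;
        while (rs.next()) {
            if (rs.getInt(column) == value) {
                rs.deleteRow();
                deleted++;
            }
        }
        return deleted;
    }
}
```

The changes become permanent only when the transaction is committed by the caller.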
The sensitivity of the cursor relates to visibility of changes made to the data by the same transaction but without using the given cursor. While the result set is open, the same transaction may use statements such as INSERT or UPDATE, and change the data of the tables from which the result set data is derived. A cursor is SENSITIVE if it reflects those changes. It is INSENSITIVE if it ignores such changes. It is ASENSITIVE if behaviour is implementation dependent.
The SQL default is ASENSITIVE, i.e., implementation dependent.
In HyperSQL all cursors are INSENSITIVE. They do not reflect changes to the data made by other statements.
A cursor is holdable if the result set is not automatically closed when the current transaction is committed. Holdability can be specified in the cursor declaration using the SQL qualifiers WITH HOLD or WITHOUT HOLD.
In JDBC, holdability is specified using either of the following values for the Statement parameter: HOLD_CURSORS_OVER_COMMIT, or CLOSE_CURSORS_AT_COMMIT.
The SQL default is WITHOUT HOLD.
The JDBC default for HyperSQL result sets is WITH HOLD for read-only result sets and WITHOUT HOLD for updatable result sets.
If the holdability of a ResultSet is
      specified in a conflicting manner in the SQL statement and the JDBC
      Statement object, the JDBC setting takes
      precedence.
The autocommit property of a connection is a feature of JDBC and
      ODBC and is not part of the SQL Standard. In autocommit mode, all
      transactional statements are followed by an implicit commit. In
      autocommit mode, all ResultSet objects are
      read-only and holdable.
The JDBC settings ResultSet.CONCUR_READ_ONLY and ResultSet.CONCUR_UPDATABLE are the available alternatives for read-only or updatable result sets. The default is ResultSet.CONCUR_READ_ONLY.
The JDBC settings, ResultSet.TYPE_FORWARD_ONLY, ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.TYPE_SCROLL_SENSITIVE are the available alternatives for both scrollability (navigation) and sensitivity. HyperSQL does not support ResultSet.TYPE_SCROLL_SENSITIVE. The two other alternatives can be used for both updatable and read-only result sets.
The JDBC settings ResultSet.CLOSE_CURSORS_AT_COMMIT and ResultSet.HOLD_CURSORS_OVER_COMMIT are the alternatives for the lifetime of the result set. The default is ResultSet.CLOSE_CURSORS_AT_COMMIT. The other setting can only be used for read-only result sets.
Examples of creating statements for updatable result sets are given below:
 Connection c = newConnection();
 Statement st;
 c.setAutoCommit(false);
 st = c.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE);
 st = c.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
When a JDBC PreparedStatement or CallableStatement is used with an SQL statement that contains dynamic parameters, the data types of the parameters are resolved and determined by the engine when the statement is prepared. The SQL Standard has detailed rules to determine the data types and imposes limits on the maximum length or precision of the parameter. HyperSQL applies the standard rules with two exceptions for parameters with String and BigDecimal Java types. HyperSQL ignores the limits when the parameter value is set, and only enforces the necessary limits when the PreparedStatement is executed. In all other cases, parameter type limits are checked and enforced when the parameter is set.
In the example below the setString()
      calls do not raise an exception, but one of the
      execute() statements does.
 // table definition: CREATE TABLE T (NAME VARCHAR(12), ...)
 Connection c = newConnection();
 PreparedStatement st = c.prepareStatement("SELECT * FROM T WHERE NAME = ?");
 // type of the parameter is VARCHAR(12), which limits length to 12 characters
 st.setString(1, "Eyjafjallajokull"); // string is longer than type, but no exception is raised here
 st.execute(); // executes with no exception and does not find any rows
 // but if an UPDATE is attempted, an exception is raised
 st = c.prepareStatement("UPDATE T SET NAME = ? WHERE ID = 10");
 st.setString(1, "Eyjafjallajokull"); // string is longer than type, but no exception is raised here
 st.execute(); // exception is thrown when HyperSQL checks the value for update
JDBC parameters can be set with any compatible type, as supported
      by the JDBC specification. For CLOB and BLOB types, you can use streams,
      or create instances of BLOB or CLOB before assigning them to the
      parameters. You can even use CLOB or BLOB objects returned from
      connections to other RDBMS servers. The
      Connection.createBlob() and
      createClob() methods can be used to create the
      new LOBs. For very large LOBs, the stream methods are preferable as they
      use less memory.
For array parameters, you can use a
      java.sql.Array object that contains the array
      elements before assigning to JDBC parameters. The
      Connection.createArrayOf(...) method can be used to create
      a new object, or you can use an Array returned from connections to other
      RDBMS servers. You can also use Java arrays of primitives to assign to
      the SQL array parameters.
The above also applies to the methods of
      java.sql.ResultSet that are used for setting
      values in new and updated rows in updatable ResultSet objects.
When the Java 8 version of HyperSQL jar is used, the setObject()
      methods of PreparedStatement and
      CallableStatement also accept objects of the new
      types introduced in Java 8 and listed below under JDBC Returned Values.
      The new Java 8 method, getObject(int columnIndex,
      Class<T> type), can be used to retrieve the value of
      an OUT parameter from a CallableStatement.
Data change statements, also called data manipulation statements
      (DML), such as INSERT, UPDATE, MERGE can be called with different
      executeUpdate() methods of java.sql.Statement
      and java.sql.PreparedStatement. Some of these
      methods allow you to specify how values for generated columns of the
      table are returned. These methods are documented in the JavaDoc for
      org.hsqldb.jdbc.JDBCStatement and
      org.hsqldb.jdbc.JDBCPreparedStatement. HyperSQL
      can return not just the generated columns, but any set of columns of the
      table. You can use this to retrieve the column values that may be
      modified by a BEFORE TRIGGER on the table.
The JDBC CallableStatement interface is used to call Java or SQL procedures that have been defined in the database. The SQL statement is in the form of CALL procedureName ( ... ) with constant value arguments or with parameter markers. Note that you must use a parameter marker for OUT and INOUT arguments of the procedure you are calling. The OUT arguments should not be set before executing the callable statement.
After executing the statement, you can retrieve the OUT and INOUT parameters with the appropriate getXXX() method.
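A minimal sketch of such a call is shown below. The procedure count_orders(IN customer_id INT, OUT total INT) is hypothetical, as are the class and method names; registerOutParameter() is the standard JDBC way of declaring the type of an OUT parameter.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical helper class; the procedure it calls is assumed to exist in the database.
final class CallableSketch {

    static int countOrders(Connection c, int customerId) throws SQLException {
        try (CallableStatement cs = c.prepareCall("CALL count_orders(?, ?)")) {
            cs.setInt(1, customerId);                  // IN argument
            cs.registerOutParameter(2, Types.INTEGER); // OUT argument: registered, never set
            cs.execute();
            return cs.getInt(2);                       // read the OUT value after execution
        }
    }
}
```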
Procedures can also return one or more result sets. You should
      call the getResultSet() and
      getMoreResults() methods to retrieve the result
      sets one by one.
SQL functions can also return a table. You can call such functions the same way as procedures and retrieve the table as a ResultSet.
The methods of the JDBC ResultSet interface are used to return values and to convert values to different types as supported by the JDBC specification. Methods of JDBC CallableStatement that have the same signature are used to return values from procedure calls.
When a CLOB or BLOB object is returned from a ResultSet, no data
      is transferred until the data is read by various methods of
      java.sql.Clob and
      java.sql.Blob. Data is streamed in large blocks
      to avoid excessive memory use.
Array objects are returned as instances of
      java.sql.Array.
When the Java 8 version of HyperSQL jar is used, the methods added
      in Java 8 become available and the behaviour of the
      getObject(int columnIndex) method for
      retrieving TIMESTAMP WITH TIME ZONE values changes. With Java 7 and
      older, this method returns a java.sql.Timestamp
      object. With Java 8 it returns a
      java.time.OffsetDateTime object which contains
      the time zone offset value as well as the other fields of the
      TIMESTAMP.
A new method, getObject(int columnIndex,
      Class<T> type), is available in Java 8. With this
      method, you specify the required return type. The common types such as
      String, Integer, byte[] are supported, as well as new types that can be
      used for DATE, TIME, TIMESTAMP, and INTERVAL values. The table below
      shows which Java classes are the most appropriate for the specified SQL
      types. In addition, you can use these Java types for values of SQL types
      that are not fully matched. For example,
      java.time.LocalDateTime can be used to retrieve a
      DATE value.
| Java class | SQL type |
|---|---|
| java.util.UUID | UUID |
| java.time.LocalDate | DATE |
| java.sql.Date | DATE |
| java.time.LocalTime | TIME |
| java.sql.Time | TIME |
| java.time.LocalDateTime | TIMESTAMP |
| java.sql.Timestamp | TIMESTAMP |
| java.time.OffsetTime | TIME WITH TIME ZONE |
| java.time.OffsetDateTime | TIMESTAMP WITH TIME ZONE |
| java.time.Period | INTERVAL MONTH, INTERVAL YEAR, INTERVAL YEAR TO MONTH |
| java.time.Duration | INTERVAL SECOND, INTERVAL MINUTE, INTERVAL HOUR, INTERVAL DAY, and their range combinations |
| java.sql.Array | all ARRAY types |
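The sketch below shows, using only the JDK, why the java.time classes fit these SQL types: java.time.Period is a date-based amount (years and months), java.time.Duration is an exact length of time, and java.time.OffsetDateTime carries a zone offset. The class and method names are invented for the illustration and no database is involved.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.Period;
import java.time.ZoneOffset;

// Hypothetical helper class built on the JDK only.
final class JavaTimeSketch {

    // Year-month intervals are date-based amounts, which is what Period models.
    static Period yearMonth(int years, int months) {
        return Period.of(years, months, 0);
    }

    // Day-time intervals are exact lengths of time, which is what Duration models.
    static Duration dayTime(long days, long hours, long minutes, long seconds) {
        return Duration.ofDays(days).plusHours(hours).plusMinutes(minutes).plusSeconds(seconds);
    }

    // A TIMESTAMP WITH TIME ZONE value keeps the offset next to the date and time fields.
    static OffsetDateTime timestampWithZone(int y, int mo, int d, int h, int mi, int s, int offsetHours) {
        return OffsetDateTime.of(y, mo, d, h, mi, s, 0, ZoneOffset.ofHours(offsetHours));
    }
}
```

Objects built this way would be passed to setObject(), or the classes requested through getObject(columnIndex, Period.class) and similar calls.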
The DECLARE CURSOR statement is used within an SQL PROCEDURE body. In the current version of HyperSQL, the cursor is used only to return a result set from the procedure. Therefore, the cursor must be declared WITH RETURN and can only be READ ONLY.
DECLARE CURSOR
declare cursor statement
<declare cursor> ::= DECLARE <cursor
      name>
[ { SENSITIVE | INSENSITIVE | ASENSITIVE } ] [ { SCROLL |
      NO SCROLL } ] 
CURSOR [ { WITH HOLD | WITHOUT HOLD } ] [ { WITH RETURN |
      WITHOUT RETURN } ]
FOR <query expression>
[ FOR { READ ONLY | UPDATE [ OF <column name list>
      ] } ]
The query expression is a SELECT statement or similar, and is
      discussed in the rest of this chapter. In the example below a cursor is
      declared for a SELECT statement. It is later opened to create the result
      set. The cursor is specified WITHOUT HOLD, so the result set is not kept
      after a commit. Use WITH HOLD to keep the result set. Note that you need
      to declare the cursor WITH RETURN as it is returned by the
      CallableStatement.
DECLARE thiscursor SCROLL CURSOR WITHOUT HOLD WITH RETURN FOR SELECT * FROM INFORMATION_SCHEMA.TABLES;
--
OPEN thiscursor;
The syntax elements that can be used in data access and data change statements are described in this section. The SQL Standard has a very extensive set of definitions for these elements. The BNF definitions given here are sometimes simplified.
Literals are used to express constant values. The general type of a literal is known by its format. The specific type is based on conventions.
unicode escape elements
<Unicode escape specifier> ::= [ UESCAPE
      <quote><Unicode escape character><quote> ]
      
<Unicode escape value> ::= <Unicode 4 digit
      escape value> | <Unicode 6 digit escape value> | <Unicode
      character escape value>
<Unicode 4 digit escape value> ::= <Unicode
      escape
      character><hexit><hexit><hexit><hexit>
<Unicode 6 digit escape value> ::= <Unicode
      escape character><plus sign>
      <hexit><hexit><hexit><hexit><hexit><hexit>
<Unicode character escape value> ::= <Unicode
      escape character><Unicode escape character>
<Unicode escape character> ::= a single
      character other than a <hexit> (a-f, A-F, 0-9), <plus sign>,
      <quote>, <double quote>, or <white
      space>
character literal
<character string literal> ::= [
      <introducer><character set specification> ] <quote> [
      <character representation>... ] <quote> [ {
      <separator> <quote> [ <character representation>... ]
      <quote> }... ]
<introducer> ::=
      <underscore>
<character representation> ::= <nonquote
      character> | <quote symbol>
<nonquote character> ::= any character apart
      from the quote symbol.
<quote symbol> ::=
      <quote><quote>
<national character string literal> ::= N
      <quote> [ <character representation>... ] <quote> [ {
      <separator> <quote> [ <character representation>... ]
      <quote> }... ]
<Unicode character string literal> ::= [
      <introducer><character set specification> ]
      U<ampersand><quote> [ <Unicode representation>... ]
      <quote> [ { <separator> <quote> [ <Unicode
      representation>... ] <quote> }... ] <Unicode escape
      specifier>
<Unicode representation> ::= <character
      representation> | <Unicode escape value>
The type of a character literal is CHARACTER. The length of the string literal is the character length of the type. If the quote character is used in a string, it is represented with two quote characters. Long literals can be divided into multiple quoted strings, separated with a space or end-of-line character.
Unicode literals start with U& and can contain ordinary characters and Unicode escapes. By default, a Unicode escape begins with the backslash ( \ ) character and is followed by four hexadecimal characters, or by a plus sign and six hexadecimal characters, which specify the character code. The Unicode escape character can be custom defined for a literal string by adding UESCAPE as in one of the examples below.
Examples of character literals are given below:
'a literal' ' string separated' ' into parts'
'a string''s literal form with quote character'
U&'Unicode string with Greek delta \0394 and phi \03a6 letters'
U&'Unicode string with forward slash // as custom escape character'UESCAPE'/'
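To make the escape rules concrete, here is a small decoder written as a sketch. It is not HyperSQL code: the class and method names are invented, the input is assumed to be the text between the quotes of a well-formed Unicode literal, and doubled quote characters are not handled.

```java
// Hypothetical decoder for the body of a U&'...' literal; performs no validation.
final class UnicodeEscapeSketch {

    static String decode(String body, char escape) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char ch = body.charAt(i);
            if (ch != escape) {
                out.append(ch);
                i++;
            } else if (body.charAt(i + 1) == escape) {
                out.append(escape);   // a doubled escape character stands for itself
                i += 2;
            } else if (body.charAt(i + 1) == '+') {
                // escape character, plus sign, six hexits
                out.appendCodePoint(Integer.parseInt(body.substring(i + 2, i + 8), 16));
                i += 8;
            } else {
                // escape character, four hexits
                out.appendCodePoint(Integer.parseInt(body.substring(i + 1, i + 5), 16));
                i += 5;
            }
        }
        return out.toString();
    }
}
```

With the default backslash escape, \0394 becomes the Greek capital delta; with '/' as the escape character, // becomes a single forward slash.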
binary literal
<binary string literal> ::= X <quote> [
      <space>... ] [ { <hexit> [ <space>... ] <hexit>
      [ <space>... ] }... ] <quote> [ { <separator>
      <quote> [ <space>... ] [ { <hexit> [ <space>...
      ] <hexit> [ <space>... ] }... ] <quote> }...
      ]
<hexit> ::= <digit> | A | B | C | D | E |
      F | a | b | c | d | e | f
The type of a binary literal is BINARY. The octet length of the binary literal is the length of the type. Case-insensitive hexadecimal characters are used in the binary string. Each pair of characters in the literal represents a byte in the binary string. Long literals can be divided into multiple quoted strings, separated with a space or end-of-line character.
X'1abACD34' 'Af'
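The pairing of hexits into bytes can be sketched in a few lines of Java. The class and method names are hypothetical, and the arguments are assumed to be the quoted parts of a binary literal with the X prefix and the quotes already removed.

```java
// Hypothetical converter from the hexits of a binary literal to bytes.
final class BinaryLiteralSketch {

    static byte[] toBytes(String... parts) {
        StringBuilder hexits = new StringBuilder();
        for (String part : parts) {
            String compact = part.replace(" ", "");   // spaces between hexits are allowed
            if (compact.length() % 2 != 0) {
                throw new IllegalArgumentException("hexits must come in pairs: " + part);
            }
            hexits.append(compact);
        }
        byte[] bytes = new byte[hexits.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // each pair of case-insensitive hexits is one byte
            bytes[i] = (byte) Integer.parseInt(hexits.substring(2 * i, 2 * i + 2), 16);
        }
        return bytes;
    }
}
```

For the literal in the example above, toBytes("1abACD34", "Af") yields five bytes, so the octet length of the type is 5.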
bit literal
<bit string literal> ::= B <quote> [
      <bit> ... ] <quote> [ { <separator> <quote> [
      <bit>... ] <quote> }... ]
<bit> ::= 0 | 1
The type of a bit literal is BIT. The bit length of the bit literal is the length of the type. Digits 0 and 1 are used to represent the bits. Long literals can be divided into multiple quoted strings, separated with a space or end-of-line character.
B'10001001' '00010'
numeric literal
<signed numeric literal> ::= [ <sign> ]
      <unsigned numeric literal>
<unsigned numeric literal> ::= <exact numeric
      literal> | <approximate numeric literal>
<exact numeric literal> ::= <unsigned
      integer> [ <period> [ <unsigned integer> ] ] |
      <period> <unsigned integer>
<sign> ::= <plus sign> | <minus
      sign>
<approximate numeric literal> ::=
      <mantissa> E <exponent>
<mantissa> ::= <exact numeric
      literal>
<exponent> ::= <signed
      integer>
<signed integer> ::= [ <sign> ]
      <unsigned integer>
<unsigned integer> ::=
      <digit>...
The type of an exact numeric literal without a decimal point is INTEGER, BIGINT, or DECIMAL, depending on the value of the literal (the smallest type that can represent the value is the type).
The type of an exact numeric literal with a decimal point is DECIMAL. The precision of a decimal literal is the total number of digits of the literal. The scale of the literal is the total number of digits to the right of the decimal point.
The type of an approximate numeric literal is DOUBLE. An approximate numeric literal always includes the mantissa and exponent, separated by E.
12
34.35
+12E-2
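The typing rules above can be restated as a short Java sketch. It is only a paraphrase of the rules as written here, not the engine's implementation; the class and method names are invented, and the input is assumed to be a well-formed unsigned literal with an uppercase E for the exponent.

```java
import java.math.BigInteger;

// Hypothetical classifier that names the SQL type of an unsigned numeric literal.
final class NumericLiteralSketch {

    static String typeOf(String literal) {
        if (literal.indexOf('E') >= 0) {
            return "DOUBLE";                            // mantissa E exponent
        }
        int point = literal.indexOf('.');
        if (point >= 0) {
            int precision = literal.length() - 1;       // every digit of the literal
            int scale = literal.length() - point - 1;   // digits right of the point
            return "DECIMAL(" + precision + "," + scale + ")";
        }
        BigInteger n = new BigInteger(literal);
        if (n.bitLength() < 32) return "INTEGER";       // fits a 32-bit signed value
        if (n.bitLength() < 64) return "BIGINT";        // fits a 64-bit signed value
        return "DECIMAL(" + literal.length() + ",0)";
    }
}
```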
boolean literal
<boolean literal> ::= TRUE | FALSE |
      UNKNOWN
The boolean literal is one of the specified keywords.
datetime and interval literal
<datetime literal> ::= <date literal> |
      <time literal> | <timestamp literal>
<date literal> ::= DATE <date
      string>
<time literal> ::= TIME <time
      string>
<timestamp literal> ::= TIMESTAMP <timestamp
      string>
<date string> ::= <quote> <unquoted
      date string> <quote>
<time string> ::= <quote> <unquoted
      time string> <quote>
<timestamp string> ::= <quote>
      <unquoted timestamp string> <quote>
<time zone interval> ::= <sign> <hours
      value> <colon> <minutes value>
<date value> ::= <years value> <minus
      sign> <months value> <minus sign> <days
      value>
<time value> ::= <hours value>
      <colon> <minutes value> <colon> <seconds
      value>
<interval literal> ::= INTERVAL [ <sign> ]
      <interval string> <interval qualifier>
<interval string> ::= <quote> <unquoted
      interval string> <quote>
<unquoted date string> ::= <date
      value>
<unquoted time string> ::= <time value> [
      <time zone interval> ]
<unquoted timestamp string> ::= <unquoted
      date string> <space> <unquoted time
      string>
<unquoted interval string> ::= [ <sign> ]
      { <year-month literal> | <day-time literal>
      }
<year-month literal> ::= <years value> [
      <minus sign> <months value> ] | <months
      value>
<day-time literal> ::= <day-time interval>
      | <time interval>
<day-time interval> ::= <days value> [
      <space> <hours value> [ <colon> <minutes value>
      [ <colon> <seconds value> ] ] ]
<time interval> ::= <hours value> [
      <colon> <minutes value> [ <colon> <seconds
      value> ] ] | <minutes value> [ <colon> <seconds
      value> ] | <seconds value>
<years value> ::= <datetime
      value>
<months value> ::= <datetime
      value>
<days value> ::= <datetime
      value>
<hours value> ::= <datetime
      value>
<minutes value> ::= <datetime
      value>
<seconds value> ::= <seconds integer
      value> [ <period> [ <seconds fraction> ]
      ]
<seconds integer value> ::= <unsigned
      integer>
<seconds fraction> ::= <unsigned
      integer>
<datetime value> ::= <unsigned
      integer>
The type of a datetime or interval literal is specified in the literal. The fractional second precision is the number of digits in the fractional part of the literal. Details are described in the SQL Language chapter.
DATE '2008-08-08'
TIME '20:08:08'
TIMESTAMP '2008-08-08 20:08:08.235'
INTERVAL '10' DAY
INTERVAL -'08:08' MINUTE TO SECOND
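For readers who build such literals from Java values, the following sketch parses the unquoted strings with java.time and counts the fractional digits. The class and method names are invented, and the strings are assumed to carry no time zone interval.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical helpers that read unquoted SQL datetime strings with the JDK parsers.
final class DatetimeLiteralSketch {

    static LocalDate parseDate(String unquoted) {
        return LocalDate.parse(unquoted);   // yyyy-MM-dd is also the ISO-8601 form
    }

    static LocalTime parseTime(String unquoted) {
        return LocalTime.parse(unquoted);   // HH:mm:ss with an optional fraction
    }

    static LocalDateTime parseTimestamp(String unquoted) {
        // SQL separates date and time with a space where ISO-8601 uses 'T'
        return LocalDateTime.parse(unquoted.replace(' ', 'T'));
    }

    // The fractional second precision is the number of digits after the period.
    static int fractionalSecondPrecision(String unquoted) {
        int point = unquoted.indexOf('.');
        return point < 0 ? 0 : unquoted.length() - point - 1;
    }
}
```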
References are identifier chains, which can be a single identifier or a chain of identifiers joined with the period symbol.
identifier chain
<identifier chain> ::= <identifier> [ {
      <period> <identifier> }... ]
<basic identifier chain> ::= <identifier
      chain>
A period-separated chain of identifiers. The identifiers in an identifier chain can refer to database objects in a hierarchy. The possible hierarchies are as follows. In each hierarchy, elements from the start or the end can be missing, but the order of elements cannot be changed.
catalog, schema, database object
catalog, schema, table, column
correlation name, column
Examples of identifier chains are given below:
SELECT MYCATALOG.MYSCHEMA.MYTABLE.MYCOL FROM MYCATALOG.MYSCHEMA.MYTABLE
DROP TABLE MYCATALOG.MYSCHEMA.MYTABLE CASCADE
ALTER SEQUENCE MYCATALOG.MYSCHEMA.MYSEQUENCE RESTART WITH 100
column reference
<column reference> ::= <basic identifier
      chain> | MODULE <period> <qualified identifier>
      <period> <column name>
Reference a column or a routine variable.
SQL parameter reference
<SQL parameter reference> ::= <basic
      identifier chain>
Reference an SQL routine parameter.
contextually typed value specification
<contextually typed value specification> ::=
      <null specification> | <default
      specification>
<null specification> ::=
      NULL
<default specification> ::=
      DEFAULT
Specify a value whose data type or value is inferred from its context.
DEFAULT is used for assignments to table columns that have a default value, or to table columns that are generated either as an IDENTITY value or as an expression.
NULL can be used only in a context where the type of the value is known. For example, a NULL can be assigned to a column of the table in an INSERT or UPDATE statement, because the type of the column is known. But if NULL is used in a SELECT list, it must be wrapped in a CAST expression that gives it a type.
Value expression is a general name for all expressions that return a value. Different types of expressions are allowed in different contexts.
value expression primary
<value expression primary> ::= <parenthesized
      value expression> | <nonparenthesized value expression
      primary>
<parenthesized value expression> ::= <left
      paren> <value expression> <right
      paren>
<nonparenthesized value expression primary> ::=
      <unsigned value specification> | <column reference> |
      <set function specification> | <scalar subquery> | <case
      expression> | <cast specification> | <next value
      expression> | <current value expression> | <routine
      invocation>
Specify a value that is syntactically self-delimited.
value specification
<value specification> ::= <literal> |
      <general value specification>
<unsigned value specification> ::= <unsigned
      literal> | <general value specification>
<target specification> ::= <host parameter
      specification> | <SQL parameter reference> | <column
      reference> | <dynamic parameter
      specification>
<simple target specification> ::= <host
      parameter specification> | <SQL parameter reference> |
      <column reference> | <embedded variable
      name>
<host parameter specification> ::= <host
      parameter name> [ <indicator parameter> ]
<dynamic parameter specification> ::=
      <question mark>
Specify one or more values, host parameters, SQL parameters, dynamic parameters, or host variables.
row value expression
<row value expression> ::= <row value special
      case> | <explicit row value constructor> 
<row value predicand> ::= <row value special
      case> | <row value constructor predicand>
<row value special case> ::=
      <nonparenthesized value expression primary>
<explicit row value constructor> ::= <left
      paren> <row value constructor element> <comma> <row
      value constructor element list> <right paren>
      |
 ROW <left paren> <row value constructor
      element list> <right paren> | <row
      subquery>
Specify a row consisting of one or more elements. The row is written as a comma-separated list of expressions, enclosed in parentheses, with the optional keyword ROW. In SQL, a row containing a single element can often be used where a single value is expected.
set function specification
<set function specification> ::= <aggregate
      function> | <grouping operation>
<grouping operation> ::= GROUPING <left
      paren> <column reference> [ { <comma> <column
      reference> }... ] <right paren>
Specify an integer value formed by bits denoting the presence
      of the column in the current row of the result of GROUPING SETS.
      HyperSQL supports <grouping operation> from
      version 2.5.1.
COALESCE
coalesce expression
<coalesce expression> ::= COALESCE <left
      paren> <value expression> { <comma> <value
      expression> }... <right paren>
Replace null values with another value. The coalesce expression
      has two or more instances of <value expression>. If the first
      <value expression> evaluates to a non-null value, it is returned
      as the result of the coalesce expression. If it is null, the next
      <value expression> is evaluated and if it
      evaluates to a non-null value, it is returned, and so on.
The type of the return value of a COALESCE expression is the
      aggregate type of the types of all the <value
      expression> instances. Therefore, any value returned is
      implicitly cast to this type. HyperSQL also features built-in functions
      with similar functionality.
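The evaluation order described above can be mirrored in Java as a sketch. Suppliers stand in for the value expressions so that later ones are evaluated only when needed; the class and method names are invented, and the implicit cast to the aggregate type is not modelled.

```java
import java.util.function.Supplier;

// Hypothetical Java analogue of COALESCE; null plays the role of SQL NULL.
final class CoalesceSketch {

    @SafeVarargs
    static <T> T coalesce(Supplier<T>... expressions) {
        for (Supplier<T> expression : expressions) {
            T value = expression.get();   // evaluated only if all earlier values were null
            if (value != null) {
                return value;
            }
        }
        return null;                      // every expression evaluated to null
    }
}
```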
NULLIF
nullif expression
<nullif expression> ::= NULLIF <left paren>
      <value expression> <comma> <value expression>
      <right paren>
Return NULL if two values are equal. If the result of the first
      <value expression> is not equal to the result
      of the second, then it is returned, otherwise NULL is returned. The type
      of the return value is the type of the first <value
      expression>.
SELECT i, NULLIF(n, 'not defined') FROM t
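The same rule can be written as a one-line Java sketch, with null standing in for SQL NULL. The names are invented for the illustration; note that when either argument is null the comparison is not true, so the first argument is returned unchanged.

```java
// Hypothetical Java analogue of NULLIF; null plays the role of SQL NULL.
final class NullIfSketch {

    static <T> T nullIf(T first, T second) {
        // only a successful equality test turns the result into null
        return first != null && first.equals(second) ? null : first;
    }
}
```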
CASE
case specification
<case specification> ::= <simple case> |
      <searched case>
<simple case> ::= CASE <case operand>
      <simple when clause>... [ <else clause> ]
      END
<searched case> ::= CASE <searched when
      clause>... [ <else clause> ] END
<simple when clause> ::= WHEN <when operand
      list> THEN <result>
<searched when clause> ::= WHEN <search
      condition> THEN <result>
<else clause> ::= ELSE
      <result>
<case operand> ::= <row value predicand> |
      <overlaps predicate part 1>
<when operand list> ::= <when operand> [ {
      <comma> <when operand> }... ]
<when operand> ::= <row value predicand> |
      <comparison predicate part 2> | <between predicate part 2> |
      <in predicate part 2> | <character like predicate part 2> |
      <octet like predicate part 2> | <similar predicate part 2> |
      <regex like predicate part 2> | <null predicate part 2> |
      <quantified comparison predicate part 2> | <match predicate
      part 2> | <overlaps predicate part 2> | <distinct predicate
      part 2>
<result> ::= <result expression> |
      NULL
<result expression> ::= <value
      expression>
Specify a conditional value. The result of a case expression is always a value. All the values introduced with THEN must be of the same type or convertible to the same type. The WHEN clause of CASE can be used in two different forms. The first form starts with a variable and the WHEN clauses follow, either as possible values for the variable, or as conditions. The second form does not start with a variable and each WHEN is followed by a self-contained conditional expression which can use any variables.
Some simple examples of the CASE expression are given below. The first two examples return 'Britain', 'Germany', or 'Other country' depending on the value of dialcode; the second also returns 'bad dial code' for a negative value. The third example uses IN and less-than predicates.
CASE dialcode WHEN 44 THEN 'Britain' WHEN 49 THEN 'Germany' ELSE 'Other country' END
CASE WHEN dialcode=44 THEN 'Britain' WHEN dialcode=49 THEN 'Germany' WHEN dialcode < 0 THEN 'bad dial code' ELSE 'Other country' END
CASE dialcode WHEN IN (44,49,30) THEN 'Europe' WHEN IN (86,91,92) THEN 'Asia' WHEN < 0 THEN 'bad dial code' ELSE 'Other continent' END
A CASE expression can be far more complex and involve several conditions.
CAST
cast specification
<cast specification> ::= CAST <left paren>
      <cast operand> AS <cast target> <right
      paren>
<cast operand> ::= <value expression> |
      <implicitly typed value specification>
<cast target> ::= <domain name> | <data
      type>
Specify a data conversion. Data conversion takes place automatically among variants of a general type. For example, numeric values are freely converted from one type to another in expressions.
Explicit type conversion is necessary in two cases. One case is to determine the type of a NULL value. The other case is to force conversion for special purposes. Values of most data types can be cast to a character type; the exceptions are the BINARY and OTHER types. The result of the cast is the literal expression of the value. Conversely, a value of a character type can be converted to another type if the character value is a literal representation of the value in the target type. Special conversions are possible between numeric and interval types, which are described in the section covering interval types.
The examples below show casts with their results:
 CAST (NULL AS TIMESTAMP)
 CAST ('   199  ' AS INTEGER) = 199
 CAST ('tRue ' AS BOOLEAN) = TRUE
 CAST (INTERVAL '2' DAY AS INTEGER) = 2
 CAST ('1992-04-21' AS DATE) = DATE '1992-04-21'
NEXT VALUE FOR
next value expression
<next value expression> ::= NEXT VALUE FOR
      <sequence generator name>
Return the next value of a sequence generator. This expression can be used as a select list element in queries, or in assignments to table columns in data change statements. If the expression is used more than once in a single row that is being evaluated, the same value is returned for each invocation. After evaluation of the particular row is complete, the sequence generator will return a different value from the old value. The new value is generated by the sequence generator by adding the increment to the last value it generated. In SQL syntax compatibility modes, variants of this expression in different SQL dialects are supported. In the example below the expression is used in an insert statement:
INSERT INTO MYTABLE(COL1, COL2) VALUES 2, NEXT VALUE FOR MYSEQUENCE
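The rule that repeated references within a single row return the same value can be sketched as follows. The sequence DEMOSEQ and the table SEQDEMO are hypothetical names created only for this illustration.

```sql
CREATE SEQUENCE DEMOSEQ START WITH 1 INCREMENT BY 1;
CREATE TABLE SEQDEMO (COL1 INTEGER, COL2 INTEGER, COL3 INTEGER);

-- Both references are evaluated for the same row, so under the rule
-- described above COL2 and COL3 receive the same sequence value
INSERT INTO SEQDEMO (COL1, COL2, COL3)
  VALUES (1, NEXT VALUE FOR DEMOSEQ, NEXT VALUE FOR DEMOSEQ);

-- A new row is evaluated, so the sequence moves on by its increment
INSERT INTO SEQDEMO (COL1, COL2, COL3)
  VALUES (2, NEXT VALUE FOR DEMOSEQ, NEXT VALUE FOR DEMOSEQ);
```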
CURRENT VALUE FOR
current value expression
<current value expression> ::= CURRENT VALUE FOR
      <sequence generator name>
Return the latest value that was returned by the NEXT VALUE FOR expression for a sequence generator in this session. In the example below, the value that was generated by the sequence for the first insert is reused for the second insert:
INSERT INTO MYTABLE(COL1, COL2) VALUES 2, NEXT VALUE FOR MYSEQUENCE;
INSERT INTO CHILDTABLE(COL1, COL2) VALUES 10, CURRENT VALUE FOR MYSEQUENCE;
value expression
value expression
<value expression> ::= <numeric value
      expression> | <string value expression> | <datetime value
      expression> | <interval value expression> | <boolean value
      expression> | <row value expression>
An expression that returns a value. The value can be a single value, or a row consisting of more than one value.
numeric value expression
numeric value expression
<numeric value expression> ::= <term> |
      <numeric value expression> <plus sign> <term> |
      <numeric value expression> <minus sign>
      <term>
<term> ::= <factor> | <term>
      <asterisk> <factor> | <term> <solidus>
      <factor>
<factor> ::= [ <sign> ] <numeric
      primary>
<numeric primary> ::= <value expression
      primary> | <numeric value function>
Specify a numeric value. The BNF indicates that
      <asterisk> and
      <solidus> (the operators for multiplication and
      division) have precedence over <minus sign> and
      <plus sign>.
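The precedence rule can be checked with a short sketch such as the one below, which uses only integer literals:

```sql
-- Multiplication and division bind more tightly than addition and
-- subtraction; parentheses override the default grouping.
-- The three columns evaluate to 14, 20 and 8.
VALUES (2 + 3 * 4, (2 + 3) * 4, 10 - 4 / 2)
```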
numeric value function
numeric value function
<numeric value function> ::= <position
      expression> | <extract expression> | <length expression>
      ...
Specify a function yielding a value of type numeric. The supported numeric value functions are listed and described in the Built In Functions chapter.
string value expression
string value expression
<string value expression> ::= <string
      concatenation> | <string factor>
<string factor> ::= <value expression
      primary> | <string value function>
<string concatenation> ::= <string value
      expression> <concatenation operator> <string
      factor>
<concatenation operator> ::=
      ||
Specify a character string value, a binary string value, or a
      bit string value. The BNF indicates that a string value expression can
      be formed by concatenation of two or more <value expression
      primary>. The types of the <value expression
      primary> elements must be compatible, that is, all must be
      string, or binary or bit string values.
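A minimal sketch of concatenation, assuming a CUSTOMER table with character columns FIRSTNAME and LASTNAME; the alias FULLNAME is a hypothetical name:

```sql
-- Three character values joined with the concatenation operator
SELECT firstname || ' ' || lastname AS fullname FROM customer
```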
character value function
string value function
<string value function> ::=
      ...
Specify a function that returns a character string or binary string. The supported character value functions are listed and described in the Built In Functions chapter.
datetime value expression
datetime value expression
<datetime value expression> ::= <datetime
      term> | <interval value expression> <plus sign>
      <datetime term> | <datetime value expression> <plus
      sign> <interval term> | <datetime value expression>
      <minus sign> <interval term>
<datetime term> ::= <datetime
      factor>
<datetime factor> ::= <datetime primary> [
      <time zone> ]
<datetime primary> ::= <value expression
      primary> | <datetime value function>
<time zone> ::= AT <time zone
      specifier>
<time zone specifier> ::= LOCAL | TIME ZONE
      <interval primary>
Specify a datetime value. Details are described in the SQL Language chapter.
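The forms allowed by the BNF above can be sketched with literals. The values in the comments follow from calendar arithmetic on the given literals.

```sql
-- <datetime value expression> <plus sign> <interval term>
VALUES (DATE '2020-06-29' + INTERVAL '7' DAY);                   -- 2020-07-06

-- <interval value expression> <plus sign> <datetime term>
VALUES (INTERVAL '7' DAY + DATE '2020-06-29');                   -- 2020-07-06

-- <datetime value expression> <minus sign> <interval term>
VALUES (TIMESTAMP '2020-06-29 10:00:00' - INTERVAL '90' MINUTE); -- 2020-06-29 08:30:00
```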
datetime value function
datetime value function
<datetime value function> ::=
      ...
Specify a function that returns a datetime value. The supported datetime value functions are listed and described in the Built In Functions chapter.
interval term
interval value expression
<interval value expression> ::= <interval
      term> | <interval value expression 1> <plus sign>
      <interval term 1> | <interval value expression 1> <minus
      sign> <interval term 1> | <left paren> <datetime value
      expression> <minus sign> <datetime term> <right
      paren> <interval qualifier>
<interval term> ::= <interval factor> |
      <interval term 2> <asterisk> <factor> | <interval
      term 2> <solidus> <factor> | <term>
      <asterisk> <interval factor>
<interval factor> ::= [ <sign> ]
      <interval primary>
<interval primary> ::= <value expression
      primary> [ <interval qualifier> ] | <interval value
      function>
<interval value expression 1> ::= <interval
      value expression>
<interval term 1> ::= <interval
      term>
<interval term 2> ::= <interval
      term>
Specify an interval value. Details are described in the SQL Language chapter.
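As a sketch of two of the forms in the BNF above, the statements below use literals only. The interval qualifier after the parenthesized subtraction determines the type of the result.

```sql
-- Datetime subtraction followed by an <interval qualifier>:
-- the two dates are 7 days apart
VALUES ((DATE '2020-06-29' - DATE '2020-06-22') DAY);

-- An interval term multiplied by a numeric factor: 6 days
VALUES (INTERVAL '2' DAY * 3);
```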
interval absolute value function
interval value function
<interval value function> ::= <interval
      absolute value function>
<interval absolute value function> ::= ABS
      <left paren> <interval value expression> <right
      paren>
Specify a function that returns the absolute value of an interval. If the interval is negative, it is negated, otherwise the original value is returned.
boolean value expression
boolean value expression
<boolean value expression> ::= <boolean
      term> | <boolean value expression> OR <boolean
      term>
<boolean term> ::= <boolean factor> |
      <boolean term> AND <boolean factor>
<boolean factor> ::= [ NOT ] <boolean
      test>
<boolean test> ::= <boolean primary> [ IS
      [ NOT ] <truth value> ]
<truth value> ::= TRUE | FALSE |
      UNKNOWN
<boolean primary> ::= <predicate> |
      <boolean predicand>
<boolean predicand> ::= <parenthesized
      boolean value expression> | <nonparenthesized value expression
      primary>
<parenthesized boolean value expression> ::=
      <left paren> <boolean value expression> <right
      paren>
Specify a boolean value.
Predicates are conditions and evaluate to a boolean value. Some
      predicates have two sides. The left side of the predicate, the
      <row value predicand>, is the common element of
      all two-sided predicates. This element is a generalisation of both
      <value expression>, which is a scalar, and of
      <explicit row value constructor>, which is a
      row. The two sides of a predicate can be split in CASE expressions where
      the <row value predicand> is part of multiple
      predicates.
In the following example, a column of the table is the left side of two predicates in a CASE expression.
SELECT CASE city WHEN 'Oslo' THEN 'Scandinavia' WHEN IN ('Dallas', 'Boston') THEN 'America' ELSE '?' END FROM customer
The number of fields in all <row value
      predicand> used in predicates must be the same and the
      types of the fields in the same position must be compatible for
      comparison. If either of these conditions does not hold, an exception is
      raised. The number of fields in a row is called the
      degree.
In many types of predicates (but not all of them), if the
      <row value predicand> evaluates to NULL, the
      result of the predicate is UNKNOWN. If the <row value
      predicand> has more than one element, and one or more of
      the fields evaluate to NULL, the result depends on the particular
      predicate.
comparison predicate
comparison predicate
<comparison predicate> ::= <row value
      predicand> <comp op> <row value
      predicand>
<comp op> ::= <equals operator> | <not
      equals operator> | <less than operator> | <greater than
      operator> | <less than or equals operator> | <greater than
      or equals operator>
Specify a comparison of two row values. If either
      <row value predicand> evaluates to NULL, the
      result of <comparison predicate> is UNKNOWN.
      Otherwise, the result is TRUE, FALSE or UNKNOWN.
If the degree of <row value
      predicand> is larger than one, comparison is performed
      between each field and the corresponding field in the other
      <row value predicand> from left to right, one
      by one.
When comparing two elements, if either field is NULL then the result is UNKNOWN.
For <equals operator>, if the result of comparison is TRUE for all fields, the result of the predicate is TRUE. If the result of comparison is FALSE for one field, the result of the predicate is FALSE. Otherwise the result is UNKNOWN.
The <not equals operator> is
      translated to NOT (<row value predicand> = <row value
      predicand>).
The <less than or equals operator> is
      translated to (<row value predicand> = <row value
      predicand>) OR (<row value predicand> < <row value
      predicand>). The <greater than or equals
      operator> is translated similarly.
For the <less than operator> and
      <greater than operator>, if two fields at a
      given position are equal, then comparison continues to the next field.
      Otherwise, the result of the last performed comparison is returned as
      the result of the predicate. This means that if the first field is NULL,
      the result is always UNKNOWN.
The logic that governs NULL values and UNKNOWN result is as follows: Suppose the NULL values were substituted by arbitrary real values. If substitution cannot change the result of the predicate, then the result is TRUE or FALSE, based on the existing non-NULL values, otherwise the result of the predicate is UNKNOWN.
The examples of comparison given below use literals, but the literals actually represent the result of evaluation of some expression.
((1, 2, 3, 4) = (1, 2, 3, 4)) IS TRUE
((1, 2, 3, 4) = (1, 2, 3, 5)) IS FALSE
((1, 2, 3, 4) < (1, 2, 3, 4)) IS FALSE
((1, 2, 3, 4) < (1, 2, 3, 5)) IS TRUE
((NULL, 1, NULL) = (NULL, 1, NULL)) IS UNKNOWN
((NULL, 1, NULL) = (NULL, 2, NULL)) IS FALSE
((NULL, 1, NULL) <> (NULL, 2, NULL)) IS TRUE
((NULL, 1, 2) <all operators> (NULL, 1, 2)) IS UNKNOWN
((1, NULL, ...) < (1, 2, ...)) IS UNKNOWN
((1, NULL, ...) < (2, NULL, ...)) IS TRUE
((2, NULL, ...) < (1, NULL, ...)) IS FALSE
BETWEEN
between predicate
<between predicate> ::= <row value
      predicand> <between predicate part 2>
<between predicate part 2> ::= [ NOT ] BETWEEN [
      ASYMMETRIC | SYMMETRIC ] <row value predicand> AND <row value
      predicand>
Specify a range comparison. The default is ASYMMETRIC. The
      expression X BETWEEN Y AND Z is equivalent to
      (X >= Y AND X <= Z). Therefore, if Y > Z,
      the BETWEEN expression is never true. The expression X BETWEEN
      SYMMETRIC Y AND Z is equivalent to (X >= Y AND X
      <= Z) OR (X >= Z AND X <= Y). The expression
      Z NOT BETWEEN ... is equivalent to NOT (Z
      BETWEEN ...). If any of the three <row value
      predicand> evaluates to NULL, the result is
      UNKNOWN.
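The difference between the two forms can be sketched as follows, assuming a hypothetical INVOICE table with a numeric TOTAL column:

```sql
-- ASYMMETRIC is the default: the lower bound (10) is greater than the
-- upper bound (1), so the condition is never true and no row qualifies
SELECT * FROM invoice WHERE total BETWEEN 10 AND 1;

-- SYMMETRIC accepts the bounds in either order, so this selects
-- rows where total lies between 1 and 10 inclusive
SELECT * FROM invoice WHERE total BETWEEN SYMMETRIC 10 AND 1;
```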
IN
in predicate
<in predicate> ::= <row value predicand> [
      NOT ] IN <in predicate value>
<in predicate value> ::= <table subquery>
      | <left paren> <in value list> <right paren>
      | <left paren> UNNEST <left paren>
      <array value expression> <right paren> <right
      paren>
<in value list> ::= <row value expression>
      [ { <comma> <row value expression> }...
      ]
Specify a quantified comparison. The expression X NOT
      IN Y is equivalent to NOT (X IN Y). The
      ( <in value list> ) is converted into a table
      with one or more rows. The expression X IN Y is
      equivalent to X = ANY Y, which is a
      <quantified comparison predicate>.
If the <table subquery> returns no
      rows, the result is FALSE. Otherwise the <row value
      predicand> is compared one by one with each row of the
      <table subquery>.
If the comparison is TRUE for at least one row, the result is TRUE. If the comparison is FALSE for all rows, the result is FALSE. Otherwise the result is UNKNOWN.
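The rules above can be illustrated in the same notation as the comparison predicate examples. The literals stand for the results of evaluating some expressions; this is a sketch of the stated logic rather than a set of statements to execute as written.

```sql
(2 IN (1, 2, NULL)) IS TRUE         -- one comparison is TRUE
(2 IN (1, 3)) IS FALSE              -- every comparison is FALSE
(2 IN (1, NULL)) IS UNKNOWN         -- no TRUE, but not all FALSE
(2 NOT IN (1, NULL)) IS UNKNOWN     -- NOT applied to UNKNOWN is UNKNOWN
```

The last line is the reason a NOT IN condition over a list or subquery that contains a NULL does not select any row.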
HyperSQL supports an extension to the SQL Standard to allow an
      array to be used in the <in predicate value>. This is intended to
      be used with prepared statements where a variable length array of values
      can be used as the parameter value for each call. The example below
      shows how this is used in SQL. The JDBC code must create a new
      java.sql.Array object that contains the values and
      set the parameter with this array.
 SELECT * FROM customer WHERE id IN ( UNNEST(?) )
 Connection conn;
 PreparedStatement ps;
 // conn and ps are instantiated here
 Array arr = conn.createArrayOf("INTEGER", new Integer[] {1, 2, 3});
 ps.setArray(1, arr);
 ResultSet rs = ps.executeQuery();
LIKE
like predicate
<like predicate> ::= <character like
      predicate> | <octet like predicate>
<character like predicate> ::= <row value
      predicand> [ NOT ] LIKE <character pattern> [ ESCAPE <escape
      character> ]
<character pattern> ::= <character value
      expression>
<escape character> ::= <character value
      expression>
<octet like predicate> ::= <row value
      predicand> [ NOT ] LIKE <octet pattern> [ ESCAPE <escape
      octet> ]
<octet pattern> ::= <binary value
      expression>
<escape octet> ::= <binary value
      expression>
Specify a pattern-match comparison for character or binary
      strings. The <row value predicand> is always a
      <string value expression> of character or
      binary type. The <character pattern> or
      <octet pattern> is a <string value
      expression> in which the underscore and percent characters
      have special meanings. The underscore means match any one character,
      while the percent means match a sequence of zero or more characters. The
      <escape character> or <escape
      octet> is also a <string value
      expression> that evaluates to a string of exactly one
      character length. If the underscore or the percent is required as normal
      characters in the pattern, the specified <escape
      character> or <escape octet> can
      be used in the pattern before the underscore or the percent. The
      <row value predicand> is compared with the
      <character pattern> and the result of
      comparison is returned. If any of the expressions in the predicate
      evaluates to NULL, the result of the predicate is UNKNOWN. The
      expression A NOT LIKE B is equivalent to NOT
      (A LIKE B). If the length of the escape is not 1 or it is used
      in the pattern not immediately before an underscore or a percent
      character, an exception is raised.
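A brief sketch of the pattern rules, assuming a CUSTOMER table with a LASTNAME column and a hypothetical PRODUCT table with a NAME column. The choice of '!' as the escape character is arbitrary.

```sql
-- Percent: 'S' followed by any sequence of zero or more characters
SELECT * FROM customer WHERE lastname LIKE 'S%';

-- Underscore: exactly one arbitrary character followed by 'ing'
SELECT * FROM customer WHERE lastname LIKE '_ing';

-- Escape: '!%' stands for a literal percent sign, so this matches
-- names that contain the text 50%
SELECT * FROM product WHERE name LIKE '%50!%%' ESCAPE '!';
```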
IS NULL
null predicate
<null predicate> ::= <row value predicand>
      IS [ NOT ] NULL
Specify a test for a null value. The expression X IS NOT NULL is NOT equivalent to NOT (X IS NULL) if the degree of the <row value predicand> is larger than 1. The rules are: If all fields are null, X IS NULL is TRUE and X IS NOT NULL is FALSE. If only some fields are null, both X IS NULL and X IS NOT NULL are FALSE. If all fields are not null, X IS NULL is FALSE and X IS NOT NULL is TRUE.
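The rules for rows of degree two can be written out in the notation used for the comparison predicate examples. The literals stand for the results of evaluating some expressions.

```sql
((NULL, NULL) IS NULL) IS TRUE
((NULL, NULL) IS NOT NULL) IS FALSE
((1, NULL) IS NULL) IS FALSE
((1, NULL) IS NOT NULL) IS FALSE
(NOT ((1, NULL) IS NULL)) IS TRUE    -- differs from (1, NULL) IS NOT NULL
((1, 2) IS NULL) IS FALSE
((1, 2) IS NOT NULL) IS TRUE
```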
ALL and ANY
quantified comparison predicate
<quantified comparison predicate> ::= <row
      value predicand> <comp op> <quantifier> <table
      subquery>
<quantifier> ::= <all> |
      <some>
<all> ::= ALL
<some> ::= SOME | ANY
Specify a quantified comparison. For a quantified comparison, the <row value predicand> is compared one by one with each row of the <table subquery>.
If the <table subquery> returns no
      rows, then if ALL is specified the result is TRUE,
      but if SOME or ANY is specified
      the result is FALSE.
If ALL is specified, if the comparison is
      TRUE for all rows, the result of the predicate is TRUE. If the
      comparison is FALSE for at least one row, the result is FALSE. Otherwise
      the result is UNKNOWN.
If SOME or ANY is
      specified, if the comparison is TRUE for at least one row, the result is
      TRUE. If the comparison is FALSE for all rows, the result is FALSE.
      Otherwise the result is UNKNOWN. Note that the IN predicate is
      equivalent to the SOME or ANY predicate using the <equals
      operator>.
In the examples below, the date of an invoice is compared to holidays in a given year. In the first example the invoice date must equal one of the holidays, in the second example it must be later than all holidays (later than the last holiday), in the third example it must be on or after some holiday (on or after the first holiday), and in the fourth example, it must be before all holidays (before the first holiday).
invoice_date = SOME (SELECT holiday_date FROM holidays)
invoice_date > ALL (SELECT holiday_date FROM holidays)
invoice_date >= ANY (SELECT holiday_date FROM holidays)
invoice_date < ALL (SELECT holiday_date FROM holidays)
EXISTS
exists predicate
<exists predicate> ::= EXISTS <table
      subquery>
Specify a test for a non-empty set. If the evaluation of
      <table subquery> results in one or more rows,
      then the expression is TRUE, otherwise FALSE.
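A typical use of EXISTS is with a correlated subquery. The sketch below assumes a CUSTOMER table with an ID column and a hypothetical INVOICE table with a CUSTOMERID column that refers to it.

```sql
-- Customers that have at least one invoice. The select list of the
-- subquery is irrelevant; only the existence of a row is tested.
SELECT c.id, c.lastname
FROM customer c
WHERE EXISTS (SELECT * FROM invoice i WHERE i.customerid = c.id)
```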
UNIQUE
unique predicate
<unique predicate> ::= UNIQUE <table
      subquery>
Specify a test for the absence of duplicate rows. The result of
      the test is either TRUE or FALSE (never UNKNOWN). The rows of the
      <table subquery> that contain one or more NULL
      values are not considered for this test. If the rest of the rows are
      distinct from each other, the result of the test is TRUE, otherwise it
      is FALSE. The distinctness of rows X and Y is tested with the predicate
      X IS DISTINCT FROM Y.
MATCH
match predicate
<match predicate> ::= <row value
      predicand> MATCH [ UNIQUE ] [ SIMPLE | PARTIAL | FULL ] <table
      subquery>
Specify a test for matching rows. The default is MATCH SIMPLE without UNIQUE. The result of the test is either TRUE or FALSE (never UNKNOWN).
The interpretation of NULL values is different from other
      predicates and quite counter-intuitive. If the <row value
      predicand> is NULL, or all of its fields are NULL, the
      result is TRUE.
Otherwise, the <row value predicand>
      is compared with each row of the <table
      subquery>.
If SIMPLE is specified, if some field of <row value predicand> is NULL, the result is TRUE. Otherwise if <row value predicand> is equal to one or more rows of <table subquery> the result is TRUE if UNIQUE is not specified, or if UNIQUE is specified and only one row matches. Otherwise the result is FALSE.
If PARTIAL is specified, if the non-null values of <row value predicand> are equal to those in one or more rows of <table subquery> the result is TRUE if UNIQUE is not specified, or if UNIQUE is specified and only one row matches. Otherwise the result is FALSE.
If FULL is specified, if some field of <row value predicand> is NULL, the result is FALSE. Otherwise if <row value predicand> is equal to one or more rows of <table subquery> the result is TRUE if UNIQUE is not specified, or if UNIQUE is specified and only one row matches. Otherwise the result is FALSE.
Note that MATCH can also be used in FOREIGN KEY constraint definitions. The exact meaning is described in the Schemas and Database Objects chapter.
CONTAINS
contains predicate
<contains predicate> ::= PERIOD <row value
      predicand> CONTAINS PERIOD <row value
      predicand>
Specify a test for two datetime periods. Each <row value predicand> must have two fields and the fields together represent a datetime period. So the predicate is always in the form PERIOD (X1, X2) CONTAINS PERIOD (Y1, Y2). Fields in each period are always datetime values of the same type (DATE or TIMESTAMP).
All datetime values are converted to TIMESTAMP WITH TIME ZONE. The second datetime value must be after the first, otherwise a data error is returned.
If the second period is fully within the first period, the result is TRUE. Otherwise it is false.
If any of the values is NULL, the result is UNKNOWN.
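The containment rule can be sketched with DATE literals. The predicates are written in the form given above, and the comments state the result that follows from the rule.

```sql
-- June 2020 lies fully within the year 2020: TRUE
PERIOD (DATE '2020-01-01', DATE '2020-12-31') CONTAINS PERIOD (DATE '2020-06-01', DATE '2020-06-30')

-- The second period ends after the first one ends: FALSE
PERIOD (DATE '2020-01-01', DATE '2020-06-15') CONTAINS PERIOD (DATE '2020-06-01', DATE '2020-06-30')
```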
EQUALS
equals predicate
<equals predicate> ::= PERIOD <row value
      predicand> EQUALS PERIOD <row value
      predicand>
Specify a test for two datetime periods. The conversions and checks are applied the same way as with the CONTAINS predicate. If the two periods have the same begin and end datetime values the result is TRUE. Otherwise it is false.
If any of the values is NULL, the result is UNKNOWN.
IS DISTINCT
is distinct predicate
<distinct predicate> ::= <row value
      predicand> IS [ NOT ] DISTINCT FROM <row value
      predicand>
Specify a test of whether two row values are distinct. The result of the test is either TRUE or FALSE (never UNKNOWN). The degree of the two <row value predicand> must be the same. Each field of the first <row value predicand> is compared to the field of the second <row value predicand> at the same position. If one field is NULL and the other is not NULL, or if the elements are NOT equal, then the result of the expression is TRUE. If no comparison result is TRUE, then the result of the predicate is FALSE. The expression X IS NOT DISTINCT FROM Y is equivalent to NOT (X IS DISTINCT FROM Y). The following check returns true if startdate is not equal to enddate. It also returns true if either startdate or enddate is NULL, but not both. It returns false in other cases.
startdate IS DISTINCT FROM enddate
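The contrast with the equals operator can be sketched in the notation used for the comparison predicate examples. The literals stand for the results of evaluating some expressions.

```sql
((1, NULL) IS DISTINCT FROM (1, NULL)) IS FALSE   -- two NULL fields are not distinct
((1, NULL) IS DISTINCT FROM (1, 2)) IS TRUE       -- NULL versus a non-NULL value
((1, 2) IS DISTINCT FROM (1, 3)) IS TRUE          -- unequal values
((1, NULL) = (1, NULL)) IS UNKNOWN                -- the equals operator cannot decide
```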
OVERLAPS
overlaps predicate
<overlaps predicate> ::= <row value
      predicand> OVERLAPS <row value predicand>
<overlaps predicate> ::= PERIOD <row value
      predicand> OVERLAPS PERIOD <row value
      predicand>
The OVERLAPS predicate tests for an overlap between two datetime periods. This predicate has two forms. The one without the PERIOD keywords is more relaxed in terms of valid periods.
If there is any overlap between the two datetime periods, the result is TRUE. Otherwise it is false.
If any of the values is NULL, the result is UNKNOWN.
In the example below, the period is compared with a week long period ending yesterday.
(startdate, enddate) OVERLAPS (CURRENT_DATE - 7 DAY, CURRENT_DATE - 1 DAY)
PRECEDES
precedes predicate
<precedes predicate> ::= PERIOD <row value
      predicand> [ IMMEDIATELY ] PRECEDES PERIOD <row value
      predicand>
Specify a test for two datetime periods. The conversions and checks are applied the same way as with the CONTAINS predicate. If the second period begins after the end of the first period, the result is TRUE. Otherwise it is false.
If IMMEDIATELY is specified, the second period must follow immediately after the end of the first period. This means the end of the first period is the same point of time as the start of the second period.
If any of the values is NULL, the result is UNKNOWN.
SUCCEEDS
succeeds predicate
<succeeds predicate> ::= PERIOD <row value
      predicand> [ IMMEDIATELY ] SUCCEEDS PERIOD <row value
      predicand>
Specify a test for two datetime periods with similar syntax to PRECEDES. If the first period begins after the end of the second period, the result is TRUE. Otherwise it is false.
If IMMEDIATELY is specified, the first period must follow immediately after the end of the second period.
If any of the values is NULL, the result is UNKNOWN.
The example below shows a predicate that returns TRUE.
PERIOD (CURRENT_DATE - 7 DAY, CURRENT_DATE) IMMEDIATELY PRECEDES PERIOD (CURRENT_DATE, CURRENT_DATE + 7 DAY)
aggregate function
aggregate function
<aggregate function> ::= COUNT <left
      paren> <asterisk> <right paren> [ <filter clause> ]
      | <general set function> [ <filter clause> ] | <array
      aggregate function> [ <filter clause> ]
<general set function> ::= <set function
      type> <left paren> [ <set quantifier> ] <value
      expression> <right paren>
<set function type> ::= <computational
      operation>
<computational operation> ::= AVG | MAX | MIN |
      SUM | EVERY | ANY | SOME | COUNT | STDDEV_POP | STDDEV_SAMP | VAR_SAMP |
      VAR_POP | MEDIAN
<set quantifier> ::= DISTINCT |
      ALL
<filter clause> ::= FILTER <left paren>
      WHERE <search condition> <right paren>
<array aggregate function> ::= ARRAY_AGG
      <left paren> [ <set quantifier> ] <value expression> [
      <order by clause> ] <right paren>
<group concat function> ::= GROUP_CONCAT
      <left paren> [ <set quantifier> ] <value expression> [
      <order by clause> ] [ SEPARATOR <separator> ] <right
      paren>
<separator> ::= <character string
      literal>
Specify a value computed from a collection of rows.
An aggregate function is used exclusively in a <query specification> and its use transforms a normal query into an aggregate query returning a single row instead of the multiple rows that the original query returns. For example, SELECT acolumn <table expression> is a query that returns the value of acolumn for all the rows that satisfy the given condition. But SELECT MAX(acolumn) <table expression> returns only one row, containing the largest value in that column. The query SELECT COUNT(*) <table expression> returns the count of rows, while SELECT COUNT(acolumn) <table expression> returns the count of rows where acolumn IS NOT NULL.
If the <table expression> is a grouped
      table (has a GROUP BY clause), the aggregate function
      returns the result of the COUNT or
      <computational operation> for each group. In
      this case the result has the same number of rows as the original grouped
      query. For example, SELECT SUM(acolumn) <table
      expression> when <table
      expression> has a GROUP BY clause,
      returns the sum of values for acolumn in each
      group.
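The difference between plain and grouped aggregate queries can be sketched as follows, assuming a hypothetical INVOICE table with CUSTOMERID and TOTAL columns:

```sql
-- Aggregate query without GROUP BY: a single row is returned
SELECT MAX(total) FROM invoice;

-- COUNT(*) counts all rows; COUNT(total) counts only the rows
-- where total IS NOT NULL
SELECT COUNT(*), COUNT(total) FROM invoice;

-- Grouped table: one row is returned for each distinct customerid
SELECT customerid, SUM(total) FROM invoice GROUP BY customerid;
```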
If all values are NULL, the aggregate function (except COUNT) returns NULL.
The SUM operations can be performed on numeric and interval expressions only. AVG and MEDIAN can be performed on numeric, interval or datetime expressions. AVG returns the average value, while SUM returns the sum of all values. MEDIAN returns the middle value in the sorted list of values.
MAX and MIN can be performed on all types of expressions and return the maximum or the minimum value, respectively.
COUNT(*) returns the count of all values,
      including nulls, while COUNT(<value
      expression>) returns the count of non-NULL values. COUNT
      with DISTINCT also accepts multiple arguments. In this usage the
      distinct combinations of the arguments are counted. Examples
      below:
SELECT COUNT(DISTINCT firstname, lastname) FROM customer
SELECT COUNT(DISTINCT (firstname, lastname)) FROM customer
The EVERY, ANY and SOME operations can be performed on boolean expressions only. EVERY returns TRUE if all the values are TRUE, otherwise FALSE. ANY and SOME are the same operation and return TRUE if at least one of the values is TRUE, otherwise FALSE.
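A sketch of the boolean aggregates, assuming a hypothetical INVOICE table with CUSTOMERID and TOTAL columns; the threshold of 100 and the aliases are arbitrary choices for this example.

```sql
-- For each customer: are all invoices above 100, and is at least
-- one invoice above 100?
SELECT customerid,
       EVERY(total > 100) AS all_above,
       SOME(total > 100) AS some_above
FROM invoice
GROUP BY customerid
```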
The other operations perform the statistical functions STDDEV_POP, STDDEV_SAMP, VAR_SAMP, VAR_POP on numeric values. NULL values are ignored in calculations.
User-defined aggregate functions can be defined and used instead of the built-in aggregate functions. Syntax and examples are given in the SQL-Invoked Routines chapter.
The <filter clause> allows you to add a
      search condition. When the search condition evaluates to TRUE for a row,
      the row is included in aggregation. Otherwise the row is not included.
      In the example below a single query returns two different filtered
      counts:
SELECT COUNT(ITEM) FILTER (WHERE GENDER = 'F') AS "FEMALE COUNT", COUNT(ITEM) FILTER (WHERE GENDER = 'M') AS "MALE COUNT" FROM PEOPLE
ARRAY_AGG is different from all other aggregate functions, as it does not ignore the NULL values. This set function returns an array that contains all the values, for different rows, for the <value expression>. For example, if the <value expression> is a column reference, the SUM function adds the values for all the rows together, while the ARRAY_AGG function adds the value for each row as a separate element of the array. ARRAY_AGG can include an optional <order by clause>. If this is used, the elements of the returned array are sorted according to the <order by clause>, which can reference all the available columns of the query, not just the <value expression> that is used as the ARRAY_AGG argument. The <order by clause> can have multiple elements (columns) and each element can include NULLS LAST or DESC qualifiers. No <separator> is used with this function.
GROUP_CONCAT is a specialised function derived from ARRAY_AGG.
      This function computes the array in the same way as ARRAY_AGG, removes
      all the NULL elements, then returns a string that is a concatenation of
      the elements of the array. If <separator> has
      been specified, it is used to separate the elements of the array.
      Otherwise the comma is used to separate the elements.
The example below shows a grouped query with ARRAY_AGG and GROUP_CONCAT. The CUSTOMER table that is included for tests in the DatabaseManager GUI app is the source of the data.
SELECT LASTNAME, ARRAY_AGG(FIRSTNAME ORDER BY FIRSTNAME) FROM Customer GROUP BY LASTNAME

LASTNAME  C2
--------- ----------------------------------------------------------
Steel     ARRAY['John','John','Laura','Robert']
King      ARRAY['George','George','James','Julia','Robert','Robert']
Sommer    ARRAY['Janet','Robert']

SELECT LASTNAME, GROUP_CONCAT(DISTINCT FIRSTNAME ORDER BY FIRSTNAME DESC SEPARATOR ' * ') FROM Customer GROUP BY LASTNAME

LASTNAME  C2
--------- -------------------------------------------------
Steel     Robert * Laura * John
King      Robert * Julia * James * George
Sommer    Robert * Janet
search condition
search condition
<search condition> ::= <boolean value
      expression>
Specify a condition that is TRUE, FALSE, or UNKNOWN. A search condition is often a predicate.
PATH
path specification
<path specification> ::= PATH <schema name
      list>
<schema name list> ::= <schema name> [ {
      <comma> <schema name> }... ]
Specify an order for searching for a user-defined SQL-invoked routine. This is not currently supported by HyperSQL.
routine invocation
routine invocation
<routine invocation> ::= <routine name>
      <SQL argument list>
<routine name> ::= [ <schema name>
      <period> ] <qualified identifier>
<SQL argument list> ::= <left paren> [
      <SQL argument> [ { <comma> <SQL argument> }... ] ]
      <right paren>
<SQL argument> ::= <value expression> |
      <target specification>
Invoke an SQL-invoked routine. Examples are given in the SQL-Invoked Routines chapter.
COLLATE
collate clause
<collate clause> ::= COLLATE <collation
      name>
Specify a collation for a column or for an ORDER BY expression. This collation is used for comparing the values of the column in different rows. Comparison can happen during the execution of SELECT, UPDATE or DELETE statements, when a UNIQUE constraint or index is defined on the column, or when the rows are sorted by an ORDER BY clause.
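A minimal sketch of a collate clause in an ORDER BY expression follows. The customer table is the test table used elsewhere in this chapter, and the collation name "English" is an assumption; check the list of collations available in your installation before using it.

```sql
-- "English" is assumed to be an available collation name
SELECT firstname, lastname
FROM customer
ORDER BY lastname COLLATE "English", firstname;
```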
CONSTRAINT
constraint name definition
<constraint name definition> ::= CONSTRAINT
      <constraint name>
<constraint characteristics> ::= <constraint
      check time> [ [ NOT ] DEFERRABLE ] | [ NOT ] DEFERRABLE [
      <constraint check time> ]
<constraint check time> ::= INITIALLY DEFERRED |
      INITIALLY IMMEDIATE
Specify the name of a constraint and its characteristics. This is an optional element of CONSTRAINT definition, not yet supported by HyperSQL.
HyperSQL fully supports all of SQL-92 data access statements, plus most of the additions from SQL:2011. Due to time constraints, the current version of this Guide does not cover the subject fully. You are advised to consult an SQL book such as the O'Reilly title, "SQL and Relational Theory" by C. J. Date.
Database queries are data access statements. The most commonly used data access statement is the SELECT statement, but there are other statements that perform a similar role. Data access statements access tables and return result tables. The returned result tables are inaccurately called result sets, as they are not necessarily sets of rows, but multisets of rows.
Result tables are formed by performing the following operations on base tables and views. These operations are loosely based on Relational Algebra.
JOIN operations
SET and MULTISET operations
SELECTION
PROJECTION
COMPUTING
COLUMN NAMING
GROUPING and AGGREGATION
SELECTION AFTER GROUPING OR AGGREGATION
SET and MULTISET (COLLECTION) OPERATIONS
ORDERING
SLICING
Conceptually, the operations are performed one by one in the above order if they apply to the given data access statement. In the example below a simple select statement is made more complex by adding various operations.
CREATE TABLE atable (a INT, b INT, c INT, d INT, e INT, f INT);

/* in the next SELECT, no join is performed and no further operation takes place */
SELECT * FROM atable

/* in the next SELECT, selection is performed by the WHERE clause, with no further action */
SELECT * FROM atable WHERE a + b = c

/* in the next SELECT, projection is performed after the other operations */
SELECT d, e, f FROM atable WHERE a + b = c

/* in the next SELECT, computation is performed after projection */
SELECT (d + e) / f FROM atable WHERE a + b = c

/* in the next two SELECT statements, column naming is performed in different ways */
SELECT (a + e) / f AS calc, f AS div FROM atable WHERE a + b = c
SELECT dcol, ecol, fcol FROM atable(acol, bcol, ccol, dcol, ecol, fcol) WHERE acol + bcol = ccol

/* in the next SELECT, both grouping and aggregation is performed */
SELECT d, e, SUM(f) FROM atable GROUP BY d, e

/* in the next SELECT, selection is performed after grouping and aggregation is performed */
SELECT d, e, SUM(f) FROM atable GROUP BY d, e HAVING SUM(f) > 10

/* in the next SELECT, a UNION is performed on two selects from the same table */
SELECT d, e, f FROM atable WHERE d = 3 UNION SELECT a, b, c FROM atable WHERE a = 30

/* in the next SELECT, ordering is performed */
SELECT (a + e) / f AS calc, f AS div FROM atable WHERE a + b = c ORDER BY calc DESC, div NULLS LAST

/* in the next SELECT, slicing is performed after ordering */
SELECT * FROM atable WHERE a + b = c ORDER BY a FETCH 5 ROWS ONLY
The following sections discuss various types of tables and operations involved in data access statements.
The SELECT statement itself does not cover all types of data
      access statements, which may combine multiple SELECT statements. The
      <query specification> is the most common data
      access statement and begins with the SELECT keyword.
SELECT STATEMENT
select statement (general)
Users generally refer to the SELECT statement when they mean a
      <query specification> or <query
      expression>. If a statement begins with SELECT and has no
      UNION or other set operations, then it is a <query
      specification>. Otherwise it is a <query
      expression>.
In data access statements, a table can be a database table (or
      view) or an ephemeral table formed for the duration of the query. Some
      types of table are <table primary> and can
      participate in joins without the use of extra parentheses. The BNF in
      the Table Primary section below lists the different types of <table
      primary>.
Tables can also be formed by specifying the values that are contained in them:
<table value constructor> ::= VALUES <row
      value expression list>
<row value expression list> ::= <table row
      value expression> [ { <comma> <table row value
      expression> }... ]
In the example below a table with two rows and three columns is constructed out of some values:
VALUES (12, 14, null), (10, 11, CURRENT_DATE)
When a table is used directly in a UNION or similar operation, the keyword TABLE is used with the name:
<explicit table> ::= TABLE <table or query
      name>
In the examples below, all rows of the two tables are included in the union. The keyword TABLE is used in the first example. The two examples below are equivalent.
TABLE atable UNION TABLE anothertable

SELECT * FROM atable UNION SELECT * FROM anothertable
A subquery is simply a query expression in parentheses. A query expression is usually a complete SELECT statement and is discussed in the rest of this chapter. A scalar subquery returns one row with one column. A row subquery returns one row with one or more columns. A table subquery returns zero or more rows with one or more columns. The distinction between different forms of subquery is syntactic. Different forms are allowed in different contexts. If a scalar subquery or a row subquery returns more than one row, an exception is raised. If a scalar or row subquery returns no row, it is usually treated as returning NULL. Depending on the context, this has different consequences.
<scalar subquery> ::= <subquery>
      
<row subquery> ::= <subquery>
      
<table subquery> ::= <subquery>
      
<subquery> ::= <left paren> <query
      expression> <right paren>
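The three forms of subquery can be sketched as below. The queries assume the customer and invoice test tables used elsewhere in this chapter, with columns id, firstname and lastname in customer, and customerid and total in invoice. The literal values are arbitrary.

```sql
-- Scalar subquery: one row with one column, used where a single value is expected
SELECT firstname, lastname,
       (SELECT MAX(total) FROM invoice WHERE customerid = customer.id) AS largest_invoice
FROM customer;

-- Row subquery: one row with several columns, compared with a row value
SELECT id FROM customer
WHERE (firstname, lastname) = (SELECT firstname, lastname FROM customer WHERE id = 1);

-- Table subquery: any number of rows, used here with IN
SELECT firstname, lastname FROM customer
WHERE id IN (SELECT customerid FROM invoice WHERE total > 1000);
```

In the first query, a customer with no invoices gets NULL in the largest_invoice column, because the aggregate over no rows yields NULL.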
A query specification is also known as a SELECT statement. It is
      the most common form of <derived table>. A
      <table expression> is a base table, a view or
      any form of allowed derived table. The SELECT statement performs
      projection, naming, computing, or aggregation on the rows of the
      <table expression>.
<query specification> ::= SELECT [ DISTINCT |
      ALL ] <select list> <table expression>
<select list> ::= <asterisk> | <select
      sublist> [ { <comma> <select sublist> }... ]
      
<select sublist> ::= <derived column> |
      <qualified asterisk> 
<qualified asterisk> ::= <asterisked
      identifier chain> <period> <asterisk>
<asterisked identifier chain> ::= <asterisked
      identifier> [ { <period> <asterisked identifier> }... ]
      
<asterisked identifier> ::=
      <identifier>
<derived column> ::= <value expression> [
      <as clause> ] 
<as clause> ::= [ AS ] <column name>
      
The qualifier DISTINCT or ALL applies to the results of the SELECT statement after all other operations have been performed. ALL simply returns the rows, while DISTINCT compares the rows and removes the duplicate ones.
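A short sketch of the two qualifiers, assuming the customer test table used elsewhere in this chapter:

```sql
-- ALL is the default: one row per customer, repeated last names included
SELECT ALL lastname FROM customer;

-- DISTINCT: each last name is returned once
SELECT DISTINCT lastname FROM customer;
```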
Projection is performed by the <select
      list>.
A single <asterisk> means all columns of
      the <table expression> are included, in the
      same order as they appear in the <table
      expression>. An asterisk qualified by a table name means
      all the columns of the qualifier table name are included. If an
      unqualified asterisk is used, then no other items are allowed in the
      <select list>. When the <table
      expression> is the direct result of NATURAL or USING joins,
      the use of <asterisk> includes the columns used
      for the join before the other columns. A qualified asterisk does not
      cover the join columns.
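The two forms of asterisk can be sketched as follows, assuming the customer and invoice test tables used elsewhere in this chapter:

```sql
-- Unqualified asterisk: all columns of the joined table, customer columns first
SELECT * FROM customer JOIN invoice ON invoice.customerid = customer.id;

-- Qualified asterisk: all customer columns, followed by one invoice column
SELECT customer.*, invoice.total
FROM customer JOIN invoice ON invoice.customerid = customer.id;
```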
A derived column is a <value expression>,
      optionally named with the <as clause>. A
      <value expression> can be many things. Common
      types include: the name of a column in the <table
      expression>; an expression based on different columns or
      constant values; a function call; an aggregate function; a CASE WHEN
      expression.
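The common kinds of derived column listed above might look like this in practice. The query assumes the customer test table; the column aliases and the threshold used in the CASE expression are illustrative only.

```sql
SELECT lastname,                                   -- name of a column
       firstname || ' ' || lastname AS fullname,   -- expression on columns and a constant
       UPPER(lastname) AS upper_name,              -- function call
       CASE WHEN id < 10 THEN 'low' ELSE 'high' END AS id_range  -- CASE WHEN expression
FROM customer;

-- An aggregate function as a derived column
SELECT COUNT(*) AS customer_count FROM customer;
```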
A table expression is part of the SELECT statement and consists of the FROM clause with optional other clauses that perform selection (of rows) and grouping from the table(s) in the FROM clause.
<table expression> ::= <from clause> [
      <where clause> ] [ <group by clause> ] [ <having
      clause> ]
<from clause> ::= FROM <table reference> [ {
      <comma> <table reference> }... ]
<table reference> ::= <table primary> |
      <joined table> 
<table primary> ::= <table or query name>
      [ <query system time period specification> ] [ [ AS ]
      <correlation name> [ <left paren> <derived column
      list> <right paren> ] ] 
| <derived table> [ AS ] <correlation
      name> [ <left paren> <derived column list> <right
      paren> ] 
| <lateral derived table> [ AS ] <correlation
      name> [ <left paren> <derived column list> <right
      paren> ] 
| <collection derived table> [ AS ]
      <correlation name> [ <left paren> <derived column
      list> <right paren> ] 
| <table function derived table> [ AS ]
      <correlation name> [ <left paren> <derived column
      list> <right paren> ] 
| <parenthesized joined table> [ AS ]
      <correlation name> [ <left paren> <derived column
      list> <right paren> ] 
<where clause> ::= WHERE <boolean value
      expression>
<group by clause> ::= GROUP BY [ <set
      quantifier> ] <grouping element> [ { <comma> <grouping
      element> }... ]
<having clause> ::= HAVING <boolean value
      expression>
<query system time period specification> ::= FOR
      SYSTEM_TIME AS OF <point in time 1>
| FOR SYSTEM_TIME BETWEEN [ SYMMETRIC ] <point in
      time 1> AND <point in time 2> 
| FOR SYSTEM_TIME FROM <point in time 1> TO
      <point in time 2>
The <from clause> contains one or more
      <table reference> separated by commas. A table
      reference is often a table or view name or a joined table.
The <where clause> filters the rows of
      the table in the <from clause> and removes the rows for which the
      search condition is not TRUE.
Table primary refers to different forms of table reference in the FROM clause.
The simplest form of reference is simply a name. This is the name of a table, a view, a transition table in a trigger definition, or a query name specified in the WITH clause of a query expression.
<table or query name> ::= <table name> |
        <transition table name> | <query name>
The <query system time period
        specification> can be used after the name of a
        system-versioned table to query historic data in the table. Without
        this clause, only the current rows of the table are returned and
        historic rows are ignored. The first example below shows a list of
        customers as of a year ago. The second example also shows any changes
        made to the email column over the previous year.
SELECT firstname, lastname, email FROM customer FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP - 1 YEAR

SELECT DISTINCT firstname, lastname, email FROM customer FOR SYSTEM_TIME FROM CURRENT_TIMESTAMP - 2 YEAR TO CURRENT_TIMESTAMP - 1 YEAR
derived table
A query expression that is enclosed in parentheses and returns
        from zero to many rows is a <table subquery>.
        In a <derived table> the query expression is
        self contained and cannot reference the columns of other table
        references. This is the traditional and most common form of use of a
        <table subquery>.
<derived table> ::= <table
        subquery>
LATERAL
When the word LATERAL is used before a <table
        subquery>, it means the query expression can reference
        the columns of other table references that go before it.
<lateral derived table> ::= LATERAL <table
        subquery>
The use of <lateral derived table>
        completely transforms the way a query is written. For example, the two
        queries below are equivalent, but with different forms. The query with
        LATERAL is evaluated separately for each row of the first table that
        satisfies the WHERE condition. The example below uses the tables that
        are created and populated in DatabaseManagerSwing with the "Insert
        test data" menu option. The first query uses a scalar subquery to
        compute the sum of invoice values for each customer. The second query
        is equivalent and uses a join with a LATERAL table.
SELECT firstname, lastname, (SELECT SUM(total) FROM invoice WHERE customerid = customer.id) s FROM customer

SELECT firstname, lastname, a.c FROM customer, LATERAL(SELECT SUM(total) FROM invoice WHERE customerid = customer.id) a (c)
UNNEST
UNNEST is similar to LATERAL, but instead of a query expression, one or more expressions that return an array are used. These expressions are converted into a table which has one column for each expression and contains the elements of the array. If WITH ORDINALITY is used, an extra column that contains the index of each element is added to this table. The number of rows in the table equals the length of the longest array. The shorter arrays are padded with NULL values. If an <array value expression> evaluates to NULL, an empty array is used in its place. The array expression can contain references to any column of the table references preceding the current table reference.
<collection derived table> ::= UNNEST <left
        paren> <array value expression>, ... <right paren> [
        WITH ORDINALITY ]
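As a minimal sketch of UNNEST with more than one array, the query below uses two array literals of different lengths. The correlation name t and the column names n, s and i are chosen for illustration.

```sql
-- Two arrays of different lengths, plus the ordinality column
SELECT *
FROM UNNEST(ARRAY[10, 20, 30], ARRAY['a', 'b']) WITH ORDINALITY AS t(n, s, i);
```

Following the padding rule described above, the result is expected to have three rows, with NULL in the s column of the third row because the second array has only two elements.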
The <array value expression> can be the
        result of a function call. If the arguments of the function call are
        values from the tables on the left of the UNNEST, then the function is
        called for each row of the table.
In the first example below, UNNEST is used with the built-in function SEQUENCE_ARRAY to build a table containing dates for the last seven days and their ordinal position. In the second example, a select statement returns costs for the last seven days. In the third example, the WITH clause turns the two selects into named subqueries which are used in a SELECT statement that uses a LEFT join.
SELECT * FROM UNNEST(SEQUENCE_ARRAY(CURRENT_DATE - 7 DAY, CURRENT_DATE - 1 DAY, 1 DAY)) WITH ORDINALITY AS T(D, I)

D          I
---------- -
2010-07-25 1
2010-07-26 2
2010-07-27 3
2010-07-28 4
2010-07-29 5
2010-07-30 6
2010-07-31 7

CREATE TABLE expenses (item_date DATE, cost DECIMAL(8,2))
--
SELECT item_date, SUM(cost) AS s FROM expenses WHERE item_date >= CURRENT_DATE - 7 DAY GROUP BY item_date

ITEM_DATE  S
---------- ------
2010-07-27 100.12
2010-07-29 50.45

WITH costs(i_d, s) AS (SELECT item_date, SUM(cost) AS s FROM expenses WHERE item_date >= CURRENT_DATE - 7 DAY GROUP BY item_date),
     dates(d, i) AS (SELECT * FROM UNNEST(SEQUENCE_ARRAY(CURRENT_DATE - 7 DAY, CURRENT_DATE - 1 DAY, 1 DAY)) WITH ORDINALITY)
SELECT i, d, s FROM dates LEFT OUTER JOIN costs ON dates.d = costs.i_d

I D          S
- ---------- ------
1 2010-07-25 (null)
2 2010-07-26 (null)
3 2010-07-27 100.12
4 2010-07-28 (null)
5 2010-07-29 50.45
6 2010-07-30 (null)
7 2010-07-31 (null)
Table Function Derived Table
When TABLE is used in this context, the <collection
        value expression> must be the result of a function call
        to a built-in function or user-defined function that returns an array
        or a table. When the function returns an array, this array is
        converted into a table, similar to the way UNNEST operates. When the
        function returns a table, the result is a MULTISET and is used as
        is.
<table function derived table> ::= TABLE <left
        paren> <collection value expression> <right
        paren>
A parenthesized joined table is simply a joined table contained in parentheses. Joined tables are discussed below.
<parenthesized joined table> ::= <left
        paren> <parenthesized joined table> <right paren> |
        <left paren> <joined table> <right
        paren>
Joins are operators with two tables as the operands, resulting in a third table, called a joined table. All join operators are evaluated left to right; therefore, with multiple joins, the table resulting from the first join operator becomes an operand of the next join operator. Parentheses can be used to group sequences of joined tables and change the evaluation order. So if more than two tables are joined together with join operators, the end result is also a joined table. There are different types of join, each producing the result table in a different way.
<joined table> ::= <cross join> |
      <qualified join> | <natural join>
<cross join> ::= <table reference> CROSS JOIN
      <table factor> 
<qualified join> ::= <table reference> [
      <join type> ] JOIN <table reference> <join
      specification>
<natural join> ::= <table reference> NATURAL
      [ <join type> ] JOIN <table factor>
<join specification> ::= <join condition> |
      <named columns join>
<join condition> ::= ON <search
      condition>
<named columns join> ::= USING <left paren>
      <join column list> <right paren>
<join type> ::= INNER | <outer join type> [
      OUTER ] 
<outer join type> ::= LEFT | RIGHT |
      FULL
<join column list> ::= <column name
      list>
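The left-to-right evaluation of joins, and the effect of parentheses on it, can be sketched with three hypothetical tables a, b and c. The tables and the columns id, a_id and b_id are assumptions made for this illustration.

```sql
-- Evaluated left to right: a JOIN b is formed first,
-- then the result is outer-joined to c
SELECT * FROM a JOIN b ON a.id = b.a_id LEFT OUTER JOIN c ON b.id = c.b_id;

-- Parentheses change the order: b JOIN c is formed first,
-- then a is outer-joined to that joined table
SELECT * FROM a LEFT OUTER JOIN (b JOIN c ON b.id = c.b_id) ON a.id = b.a_id;
```

The two queries are not equivalent. The first returns only rows of a that have a match in b, while the second returns every row of a.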
CROSS JOIN
The simplest form of join is CROSS JOIN. The CROSS JOIN of two tables is a table that has all the columns of the first table, followed by all the columns of the second table, in the original order. Each row of the first table is combined with each row of the second table to fill the rows of the new table. If the rows of each table form a set, then the rows of the CROSS JOIN table form the Cartesian product of the rows of the two table operands.
Conditions are not allowed as part of a cross join, which is
      simply A CROSS JOIN B. Any conditions in a WHERE
      clause are later applied to the table resulting from the cross
      join.
Tables in the FROM clause that are separated with commas are equivalent to
      cross joins between the tables. Two joined tables separated with a
      comma, such as A, B, are equivalent to (A) CROSS JOIN
      (B), which means the joined tables A and B are populated separately
      before they are joined.
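The equivalence between the comma-separated form and CROSS JOIN can be sketched as below, assuming the customer and invoice test tables used elsewhere in this chapter:

```sql
-- Explicit cross join; the WHERE condition is applied to the combined rows
SELECT * FROM customer CROSS JOIN invoice WHERE invoice.customerid = customer.id;

-- Comma-separated form of the same query
SELECT * FROM customer, invoice WHERE invoice.customerid = customer.id;
```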
CROSS JOIN is not generally very useful, as it returns large result tables unless WHERE conditions are used.
UNION JOIN
The UNION JOIN has limited use in queries. The result table has
      the same columns as that of CROSS JOIN. Each row of the first table is
      extended to the right with nulls and added to the new table. Each row of
      the second table is extended to the left with nulls and added to the new
      table. The UNION JOIN is expressed as A UNION JOIN B.
      This should not be confused with A UNION B, which is
      a set operation. Union join is for special applications and is not
      commonly used.
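A minimal sketch of UNION JOIN follows. The tables l_tab and r_tab and their data are hypothetical and serve only to show the shape of the result.

```sql
-- Hypothetical tables used only for illustration
CREATE TABLE l_tab (x INT);
CREATE TABLE r_tab (y INT);
INSERT INTO l_tab VALUES (1), (2);
INSERT INTO r_tab VALUES (10);

-- Result columns are x, y; no row combines values from both tables
SELECT * FROM l_tab UNION JOIN r_tab;
```

Following the description above, the result is expected to contain three rows, in no guaranteed order: (1, NULL), (2, NULL) and (NULL, 10).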
JOIN ... ON
The condition join is similar to CROSS JOIN, but a condition is
      tested for each row of the new table and the row is created only if the
      condition is true. This form of join is expressed as A JOIN B
      ON (<search condition>).
Equijoin is a condition join in which the search condition is an equality condition between one or more pairs of columns from the two tables. Equijoin is the most commonly used type of join.
SELECT a.*, b.* FROM a INNER JOIN b ON a.col_one = b.col_two
JOIN ... USING
NATURAL JOIN
Joins with USING or NATURAL keywords are similar to an equijoin
      but they cannot be replaced simply with an equijoin. The new table is
      formed with the specified or implied shared columns of the two tables,
      followed by the rest of the columns from each table. In NATURAL JOIN,
      the shared columns are all the column pairs that have the same name in
      the first and second table. In JOIN USING, only column names that are
      specified by the USING clause are shared. The joins are expressed as
      A NATURAL JOIN B, and A JOIN B USING
      (<comma separated column name list>).
The columns of the joined table are formed by the following procedures: In JOIN ... USING the shared columns are added to the joined table in the same order as they appear in the column name list. In NATURAL JOIN the shared columns are added to the joined table in the same order as they appear in the first table. In both forms of join, the non-shared columns of the first table are added in the order they appear in the first table, finally the non-shared columns of the second table are added in the order they appear in the second table.
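The column ordering rules above can be sketched with two hypothetical tables that share the columns id and code. The table and column names are assumptions made for this illustration.

```sql
-- Hypothetical tables used only for illustration
CREATE TABLE ta (id INT, code INT, x INT);
CREATE TABLE tb (code INT, y INT, id INT);

-- NATURAL JOIN: shared columns in the order of the first table
-- expected column order: id, code, x, y
SELECT * FROM ta NATURAL JOIN tb;

-- JOIN ... USING: shared columns in the order of the USING list
-- expected column order: code, id, x, y
SELECT * FROM ta JOIN tb USING (code, id);
```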
The type of each shared column of the joined table is based on the type of the columns in the original tables. If the original types are not exactly the same, the type of the shared column is formed by type aggregation. Type aggregation selects a type that can represent values of both aggregated types. Simple type aggregation picks one of the types. For example, SMALLINT and INTEGER result in INTEGER, and VARCHAR(10) and VARCHAR(20) result in VARCHAR(20). More complex type aggregation inherits properties from both types. For example, DECIMAL(8) and DECIMAL(6,2) result in DECIMAL(8,2).
In the examples below, the rows are joined exactly the same way, but the first query contains a.col_two and b.col_two together with all the rest of the columns of both tables, while the second query returns only one copy of col_two.
SELECT * FROM a INNER JOIN b ON a.col_two = b.col_two

SELECT * FROM a INNER JOIN b USING (col_two)
OUTER JOIN
LEFT, RIGHT and FULL OUTER JOIN
The three qualifiers can be added to all types of JOIN except
      CROSS JOIN and UNION JOIN. First the new table is populated with the
      rows from the original join. If LEFT is specified, all the rows from the
      first table that did not make it into the new table are extended to the
      right with nulls and added to the table. If RIGHT is specified, all the
      rows from the second table that did not make it into the new table are
      extended to the left with nulls and added to the table. If FULL is
      specified, the addition of leftover rows is performed from both the
      first and the second table. These forms are expressed by prefixing the
      join specification with the given keyword. For example, A LEFT
      OUTER JOIN B ON (<search condition>) or A
      NATURAL FULL OUTER JOIN B or A FULL OUTER JOIN B
      USING (<comma separated column name list>).
SELECT a.*, b.* FROM a LEFT OUTER JOIN b ON a.col_one = b.col_two
Despite the name, selection has nothing to do with the list of columns in a SELECT statement. In fact, it refers to the search condition used to limit the rows that form a result table (selection of rows, not columns). In SQL, simple selection is expressed with a WHERE condition appended to a single table or a joined table. In some cases, this method of selection is the only method available, for example in DELETE and UPDATE statements. But when it is possible to perform the selection with join conditions, this is the better method, as it results in a clearer expression of the query.
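The two styles of selection can be sketched as below, assuming the customer and invoice test tables used elsewhere in this chapter. The aliases c and i and the threshold value are illustrative.

```sql
-- Selection expressed only with WHERE on a comma-separated list of tables
SELECT c.lastname, i.total
FROM customer c, invoice i
WHERE i.customerid = c.id AND i.total > 100;

-- The same rows, with the matching condition moved into the join
SELECT c.lastname, i.total
FROM customer c JOIN invoice i ON i.customerid = c.id
WHERE i.total > 100;
```

In the second form the condition that relates the two tables is kept apart from the condition that filters the invoices, which makes the intent of each easier to see.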
Projection is selection of the columns from a simple or joined table to form a result table. Explicit projection is performed in the SELECT statement by specifying the select column list. Some form of projection is also performed in JOIN ... USING and NATURAL JOIN.
The joined table has columns that are formed according to the rules mentioned above. But in many cases, not all the columns are necessary for the intended operation. If the statement is in the form, SELECT * FROM <joined table>, then all the columns of <joined table> are returned. But normally, the columns to be returned are specified after the SELECT keyword, separated from each other with commas.
In the select list, it is possible to use expressions that reference any columns of <joined table>. Each of these expressions forms a computed column. It is computed for each row of the result table, using the values of the columns of the <joined table> for that row.
Naming is used to hide the original names of tables or table columns and to replace them with new names in the scope of the query. Naming is also used for defining names for computed columns.
Without explicit naming, the name of a column is a predefined name. If the column is a column of a table, or is a named parameter, the name of the table column or parameter is used. Otherwise it is generated by the database engine. HyperSQL generates column names such as C1, C2, etc. As generated naming is implementation defined according to the SQL Standard, it is better to explicitly name the computed and derived columns in your applications.
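A short sketch of the difference, using the atable table defined earlier in this chapter. The aliases ab_sum and c_double are illustrative.

```sql
-- No explicit names: the computed columns get engine-generated names
SELECT a + b, c * 2 FROM atable;

-- Explicit names that do not depend on the database engine
SELECT a + b AS ab_sum, c * 2 AS c_double FROM atable;
```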
Naming in Joined Table
Naming is performed by adding a new name after a table's real name
      and by adding a list of column names after the new table name. Both
      table naming and column naming are optional, but table naming is
      required for column naming. The expression A [AS] X (<comma
      separated column name list>) means table A is used in the
      query expression as table X and its columns are named as in the given
      list. The original name A, or its original column names, are not visible
      in the scope of the query. The BNF is given below. The
      <correlation name> can be the same or different
      from the name of the table. The <derived column
      list> is a comma separated list of column names. The degree
      of this list must be equal to the degree of the table. The column names
      in the list must be distinct. They can be the same or different from the
      names of the table's columns.
<table or query name> [ [ AS ] <correlation
      name> [ <left paren> <derived column list> <right
      paren> ] ]
In the examples below, the columns of the original tables are named (a, b, c, d, e, f). The two queries are equivalent. In the second query, the table and its columns are renamed and the new names are used in the WHERE clauses:
CREATE TABLE atable (a INT, b INT, c INT, d INT, e INT, f INT);

SELECT d, e, f FROM atable WHERE a + b = c

SELECT x, y, z FROM atable AS t (u, v, w, x, y, z) WHERE u + v = w
Naming in Select List
Naming in the SELECT list logically takes place after naming in the joined table. The new names for columns are not visible in the immediate query specification or query expression. They become visible in the ORDER BY clause and in the result table that is returned to the user. They also become visible in an enclosing query expression if the query expression is used as a derived table.
In the example below, the query is on the same table but with column renaming in the Select list. The new names are used in the ORDER BY clause:
SELECT x + y AS xysum, y + z AS yzsum FROM atable AS t (u, v, w, x, y, z) WHERE u + v = w ORDER BY xysum, yzsum
If the names xysum or yzsum
      are not used, the computed columns cannot be referenced in the ORDER BY
      list.
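Names given in the select list can also be used by an enclosing query when the query is used as a derived table. The sketch below reuses the atable example; the correlation name s and the threshold value are illustrative.

```sql
-- The inner names xysum and yzsum are columns of the derived table s,
-- so the outer query can use them in its WHERE and ORDER BY clauses
SELECT xysum, yzsum
FROM (SELECT x + y AS xysum, y + z AS yzsum
      FROM atable AS t (u, v, w, x, y, z)
      WHERE u + v = w) AS s
WHERE xysum > 10
ORDER BY xysum;
```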
Name Resolution
In a joined table, if a column name appears in tables on both sides then any reference to the name must use the table name in order to specify which table is being referred to.
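As a sketch of name resolution, the query below assumes that both the customer and invoice test tables have a column named id, and that only invoice has a column named total.

```sql
-- id exists in both tables, so each reference is qualified with the table name;
-- total exists only in invoice and needs no qualifier
SELECT customer.id AS customer_id, invoice.id AS invoice_id, total
FROM customer JOIN invoice ON invoice.customerid = customer.id;
```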
Grouping Operations
Grouping results in the elimination of duplicate rows. A grouping operation is performed after the operations discussed above. A simple form of grouping is performed by the use of DISTINCT after SELECT. This eliminates all the duplicate rows (rows that have the same value in each of their columns when compared to another row). The other form of grouping is performed with the GROUP BY clause. This form is usually used together with aggregation.
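The two forms of grouping can be compared directly. The queries below assume the customer test table and return the same rows:

```sql
-- Grouping with DISTINCT
SELECT DISTINCT lastname, firstname FROM customer;

-- The same grouping expressed with GROUP BY
SELECT lastname, firstname FROM customer GROUP BY lastname, firstname;
```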
GROUP BY
<group by clause> ::= GROUP BY [ <set
      quantifier> ] <grouping element> [ { <comma> <grouping
      element> }... ]
<grouping element> ::= <ordinary grouping
      set> | <rollup list> | <cube list> | <grouping sets
      specification> | <empty grouping set>
<ordinary grouping set> ::= <grouping column
      reference> | <left paren> <grouping column reference
      list> <right paren>
<grouping column reference list> ::=
      <grouping column reference> [ { <comma> <grouping column
      reference> }... ]
<grouping column reference> ::= <column
      reference> [ <collate clause> ] 
<rollup list> ::= ROLLUP <left paren>
      <ordinary grouping set list> <right
      paren>
<ordinary grouping set list> ::= <ordinary
      grouping set> [ { <comma> <ordinary grouping set> }... ]
      
<cube list> ::= CUBE <left paren>
      <ordinary grouping set list> <right paren>
      
<grouping sets specification> ::= GROUPING SETS
      <left paren> <grouping set list> <right paren>
      
<grouping set list> ::= <grouping set> [ {
      <comma> <grouping set> }... ] 
<grouping set> ::= <ordinary grouping set>
      | <rollup list> | <cube list> | <grouping sets
      specification> | <empty grouping set> 
<empty grouping set> ::= <left paren>
      <right paren>
An ordinary <group by clause> is a comma
      separated list of columns of the table formed by the <from
      clause> or expressions based on the columns. This is the
      most common usage and can be described as GROUP BY <column
      reference> [ { <comma> <grouping column reference> }...
      ].
When a <group by clause> is used, only
      the columns used in the <group by clause> or
      expressions used there, can be used in the <select
      list>, together with any <aggregate
      function> on other columns. In other words, the column
      names or expressions listed in the GROUP BY clause dictate what can be
      used in the <select list>. After the rows of
      the table formed by the <from clause> and the
      <where clause> are finalised, the grouping
      operation groups together the rows that have the same values in the
      columns of the <group by clause>. Then any
      <aggregate function> in the <select
      list> is performed on each group, and for each group, a row
      is formed that contains the values of the columns of the
      <group by clause> and the values returned from
      each <aggregate function>.
When the type of <column reference> is
      character string, the <collate clause> can be
      used to specify the collation used for grouping the rows. For example, a
      collation that is not case sensitive can be used, or a collation for a
      different language than the original collation of the column.
The first example below shows a simple GROUP BY, while the second example adds a HAVING condition.
CREATE TABLE REVENUE(CHANNEL VARCHAR(20), YEAR INTEGER, COUNTRY VARCHAR(2), PROVINCE VARCHAR(20), SALES INTEGER);
SELECT CHANNEL, YEAR, COUNTRY, SUM(SALES) FROM REVENUE GROUP BY CHANNEL, YEAR, COUNTRY;
SELECT CHANNEL, YEAR, COUNTRY, SUM(SALES) FROM REVENUE GROUP BY CHANNEL, YEAR, COUNTRY HAVING SUM(SALES) > 50000;
An extended <group by clause> may
      comprise elements such as GROUPING SETS, ROLLUP, CUBE and the empty
      grouping set. These syntax elements are expanded and then simplified
      into a list of parenthesized column elements, which result in multiple
      grouping operations. HyperSQL version 2.5.1 supports all the syntax
      listed above. The example below uses ROLLUP for grouping.
SELECT CHANNEL, YEAR, COUNTRY, SUM(SALES) AS S FROM REVENUE GROUP BY ROLLUP(CHANNEL, YEAR, COUNTRY);
The ROLLUP is translated into 4 groupings: (channel, year, country), (channel, year), (channel), (). The result set will contain the rows as grouped by (channel, year, country), then rows as grouped by (channel, year) with the country column replaced by null, then rows as grouped by (channel) only, with year and country columns replaced by null, then a single row representing the () empty group with all three columns replaced by null.
CHANNEL      YEAR   COUNTRY S
------------ ------ ------- -------
INTERNET     2009   GB      25000
INTERNET     2009   US      275000
INTERNET     2010   GB      45000
INTERNET     2010   US      500000
DIRECT SALES 2009   GB      162000
DIRECT SALES 2009   US      1602500
DIRECT SALES 2010   GB      181000
DIRECT SALES 2010   US      1833000
INTERNET     2009   (null)  300000
INTERNET     2010   (null)  545000
DIRECT SALES 2009   (null)  1764500
DIRECT SALES 2010   (null)  2014000
INTERNET     (null) (null)  845000
DIRECT SALES (null) (null)  3778500
(null)       (null) (null)  4623500
If CUBE is used instead of ROLLUP, the remaining combinations of the three columns are added to those produced by ROLLUP. These are (channel, country), (year, country), (year) and (country).
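The expansion of ROLLUP and CUBE into grouping sets can be sketched in a few lines of Python. This is only an illustration of the rule described above, not HyperSQL's implementation; the function names `rollup` and `cube` are hypothetical, and each element is assumed to be a single column rather than a parenthesized column list.

```python
from itertools import combinations

def rollup(columns):
    """ROLLUP(c1..cn): every prefix of the column list, longest first, ending with ()."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]

def cube(columns):
    """CUBE(c1..cn): every subset of the columns, largest subsets first."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets

cols = ["channel", "year", "country"]
rollup_sets = rollup(cols)
cube_sets = cube(cols)

# Grouping sets that CUBE produces in addition to those of ROLLUP
extra = [s for s in cube_sets if s not in rollup_sets]

print(rollup_sets)
print(extra)
```

For three columns, ROLLUP yields 4 grouping sets and CUBE yields 8, so CUBE adds the four listed in the paragraph above.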
The optional <set quantifier> is either ALL or DISTINCT and defaults to ALL. When GROUPING SETS is used and multiple sets are specified and some duplicate groups are created as a result, use of DISTINCT eliminates the duplicate groups.
Note that any ordering of the rows returned by GROUP BY is incidental. You need to use ORDER BY for the ordering you require.
HAVING
A <having clause> filters the rows of the
      table that is formed after applying the <group by
      clause> using its search condition. The search condition
      must be an expression based on the expressions in the GROUP BY list or
      the aggregate functions used.
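The sequence of grouping, aggregation and HAVING filtering can be modelled with a short Python sketch. The sample rows and the helper names `group_and_sum` and `having` are hypothetical and chosen only for illustration; the engine's actual processing is not shown here.

```python
from collections import defaultdict

# Hypothetical sample rows: (channel, year, country, sales)
ROWS = [
    ("INTERNET", 2009, "GB", 25000),
    ("INTERNET", 2009, "US", 275000),
    ("DIRECT SALES", 2009, "GB", 162000),
    ("DIRECT SALES", 2009, "US", 1602500),
]

def group_and_sum(rows, key_indexes, value_index):
    """Group rows on the key columns and compute SUM of the value column per group."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[value_index]
    return dict(totals)

def having(groups, predicate):
    """Keep only the groups whose aggregate value satisfies the predicate."""
    return {key: total for key, total in groups.items() if predicate(total)}

# Models: SELECT CHANNEL, SUM(SALES) FROM REVENUE GROUP BY CHANNEL
by_channel = group_and_sum(ROWS, key_indexes=(0,), value_index=3)

# Models: ... HAVING SUM(SALES) > 500000
big_channels = having(by_channel, lambda total: total > 500000)

print(by_channel)
print(big_channels)
```

The HAVING step sees one row per group, which is why its condition can refer only to the grouping expressions and the aggregate values.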
DISTINCT
SELECT DISTINCT
When the keyword DISTINCT is used after SELECT, it works as a
      shortcut replacement for a simple GROUP BY clause. The expressions in
      the SELECT list are used directly as the <group by
      clause>. The following examples of SELECT DISTINCT and
      SELECT with GROUP BY are equivalent.
SELECT DISTINCT d, e + f FROM atable WHERE a + b = c
SELECT d, e + f FROM atable WHERE a + b = c GROUP BY d, e + f
Aggregation is an operation that computes a single value from the values of a column over several rows. The operation is performed with an aggregate function. The simplest form of aggregation is counting, performed by the COUNT function.
Other common aggregate functions return the maximum, minimum and average value among the values in different rows. Aggregate functions were discussed earlier in this chapter.
Set and Multiset Operations
While join operations generally result in laterally expanded tables, SET and COLLECTION operations are performed on two tables that have the same degree and result in a table of the same degree. The SET operations are UNION, INTERSECT and EXCEPT (difference). When each of these operations is performed on two tables, the collection of rows in each table and in the result is reduced to a set of rows, by eliminating duplicates. The set operations are then performed on the two tables, resulting in the new table which itself is a set of rows. Collection operations are similar but the tables are not reduced to sets before or after the operation and the result is not necessarily a set, but a collection of rows.
The set operations on two tables A and B are: A UNION
      [DISTINCT] B, A INTERSECT [DISTINCT] B and
      A EXCEPT [DISTINCT] B. The result table is formed in
      the following way: The UNION operation adds all the rows from A and B
      into the new table, but avoids copying duplicate rows. The INTERSECT
      operation copies only those rows from each table that also exist in the
      other table, but avoids copying duplicate rows. The EXCEPT operation
      copies those rows from the first table which do not exist in the second
      table, but avoids copying duplicate rows.
The collection operations are similar to the set operations, but
      can return duplicate rows. They are: A UNION ALL B,
      A INTERSECT ALL B and A EXCEPT ALL
      B. The UNION ALL operation adds all the rows from A and B into
      the new table. The INTERSECT ALL operation copies only those rows from
      each table that also exist in the other table. If n copies of a row
      exist in one table, and m copies in the other table, the number of
      copies in the result table is the smaller of n and m. The EXCEPT ALL
      operation copies those rows from the first table which do not exist in
      the second table. If n copies of a row exist in the first table and m
      copies in the second table, the number of copies in the result table is
      n-m, or if n < m, then zero.
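The counting rules for the ALL variants can be checked with a small Python model that treats each table as a multiset of rows. This is a sketch of the semantics only; the function names and the sample tables A and B are hypothetical.

```python
from collections import Counter

def union_all(a, b):
    """A UNION ALL B: n + m copies of each row."""
    return Counter(a) + Counter(b)

def intersect_all(a, b):
    """A INTERSECT ALL B: min(n, m) copies of each row."""
    return Counter(a) & Counter(b)

def except_all(a, b):
    """A EXCEPT ALL B: n - m copies of each row, or none if n <= m."""
    return Counter(a) - Counter(b)

def intersect_distinct(a, b):
    """A INTERSECT [DISTINCT] B: both tables are reduced to sets first."""
    return set(a) & set(b)

def except_distinct(a, b):
    """A EXCEPT [DISTINCT] B: both tables are reduced to sets first."""
    return set(a) - set(b)

# Hypothetical single-column tables; each row is a tuple
A = [(1,), (1,), (1,), (2,)]
B = [(1,), (2,), (2,), (3,)]

print(union_all(A, B))        # row (1,) appears 3 + 1 = 4 times
print(intersect_all(A, B))    # row (1,) appears min(3, 1) = 1 time
print(except_all(A, B))       # row (1,) appears 3 - 1 = 2 times
print(except_distinct(A, B))  # empty: every distinct row of A exists in B
```

Note how EXCEPT ALL keeps two copies of the row (1,) while EXCEPT DISTINCT returns nothing for the same input.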
The optional WITH clause can be used in a query expression. The
      WITH clause lists one or more named ephemeral tables that can be
      referenced in the query expression body. The ephemeral tables are
      created and populated before the rest of the query expression is
      executed. Their contents do not change during the execution of the
      <query expression body>.
<with clause> ::= WITH [ RECURSIVE ] <with
      list>
<with list> ::= <with list element> [ {
      <comma> <with list element> }... ] 
<with list element> ::= <query name> [
      <left paren> <with column list> <right paren> ] AS
      <left paren> <query expression> <right paren>
      
<with column list> ::= <column name
      list>
An example of the use of the WITH clause is given above under
      UNNEST. The <query expression> in the WITH
      clause is evaluated once and its result table can be referenced in the
      body of the main <query expression body> using
      the specified <query name>.
The RECURSIVE keyword changes the way the elements of the
      <with list> are interpreted. The
      <query expression> contained in the
      <with list element> must be the UNION or UNION
      ALL of two <query expression body> elements (simple VALUES or
      SELECT statements). The left element of the UNION is evaluated first and
      its result becomes the result of the <with list
      element>. After this step, the current result of the
      <with list element> is referenced in the right element (a SELECT
      statement) of the UNION, the UNION is performed between the result and
      previous result of the <with list element>,
      which is enlarged by this operation. The UNION operation is performed
      again and again, until the result of the <with list
      element> stops changing. The result of the
      <with list element> is now complete and is
      later used in the execution of the <query expression
      body>. When RECURSIVE is used, the <with column
      list> must be defined.
HyperSQL limits recursion to 265 rounds. If this is exceeded, an error is raised.
A trivial example of a recursive query is given below. Note the first column GEN. For example, if each row of the table represents a member of a family of dogs, together with its parent, the first column of the result indicates the calculated generation of each dog, ranging from first to fourth generation.
CREATE TABLE pptree (pid INT, id INT);
INSERT INTO pptree VALUES (NULL, 1) ,(1,2), (1,3),(2,4),(4,5),(3,6),(3,7);
WITH RECURSIVE tree (gen, par, child) AS (
  VALUES(1, CAST(null as int), 1)
  UNION
  SELECT gen + 1, pid, id FROM pptree, tree WHERE pid = child
) SELECT * FROM tree;

GEN PAR    CHILD
--- ------ -----
1   (null) 1
2   1      2
2   1      3
3   2      4
3   3      6
3   3      7
4   4      5
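The repeated UNION described above is a fixed-point iteration, which the following Python sketch reproduces for the pptree data. It is a model of the evaluation rule, not the engine's code: the function name `recursive_tree` is hypothetical, and the exact point at which HyperSQL counts a round against its limit is an assumption.

```python
# Rows of pptree as (pid, id); None stands for SQL NULL
PPTREE = [(None, 1), (1, 2), (1, 3), (2, 4), (4, 5), (3, 6), (3, 7)]

MAX_ROUNDS = 265  # the recursion limit stated in the text

def recursive_tree(pptree):
    """Evaluate the recursive WITH element by repeating the UNION until nothing changes."""
    # Left element of the UNION: VALUES(1, NULL, 1)
    result = {(1, None, 1)}
    for _ in range(MAX_ROUNDS):
        # Right element: SELECT gen + 1, pid, id FROM pptree, tree WHERE pid = child
        step = {
            (gen + 1, pid, cid)
            for (pid, cid) in pptree
            for (gen, _par, child) in result
            if pid == child
        }
        enlarged = result | step  # UNION removes duplicate rows
        if enlarged == result:    # the result stopped changing
            return result
        result = enlarged
    raise RuntimeError("recursion limit exceeded")

tree = recursive_tree(PPTREE)
for row in sorted(tree, key=lambda r: (r[0], r[2])):
    print(row)
```

With this data the result stabilises after a few rounds and contains the same seven rows as the output listed above.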
If recursive queries become complex, they also become very difficult to develop and debug. HyperSQL provides an alternative to this with user-defined SQL functions which return tables. Functions can perform any complex, repetitive task with better control, using loops, variables and, if necessary, recursion.
A query expression consists of an optional WITH clause and a query expression body. The optional WITH clause lists one or more named ephemeral tables that can be referenced, just like the database tables in the query expression body.
<query expression> ::= [ <with clause> ]
      <query expression body>
A query expression body refers to a table formed by using UNION and other set operations. The query expression body is evaluated from left to right and the INTERSECT operator has precedence over the UNION and EXCEPT operators. A simplified BNF is given below:
<query expression body> ::= <query term> |
      <query expression body> UNION | EXCEPT [ ALL | DISTINCT ] [
      <corresponding spec> ] <query term>
<query term> ::= <query primary> |
      <query term> INTERSECT [ ALL | DISTINCT ] [ <corresponding
      spec> ] <query term>
<query primary> ::= <simple table> |
      <left paren> <query expression body> [ <order by
      clause> ] [ <result offset clause> ] [ <fetch first
      clause> ] <right paren>
<simple table> ::= <query specification> |
      <table value constructor> | <explicit table>
<explicit table> ::= TABLE <table or query name>
<corresponding spec> ::= CORRESPONDING [ BY
      <left paren> <column name list> <right paren>
      ]
A <query term> and a <query
      primary> can be a SELECT statement, an
      <explicit table>, or a <table value
      constructor>.
The CORRESPONDING clause is optional. If it is not specified, then
      the <query term> and the <query
      primary> must have the same number of columns. If
      CORRESPONDING is specified, the two sides need not have the same number
      of columns. If no column list is used with CORRESPONDING, then all the
      column names that are common to the tables on the two sides are used,
      in the order in which they appear in the first table. If a column list
      is used, it allows you to select only some columns of the tables on the
      left and right side to create the new table. In the example below the
      columns named u and v from the two SELECT statements are used to create
      the UNION table.
SELECT * FROM atable UNION CORRESPONDING BY (u, v) SELECT * FROM anothertable
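The column selection performed by CORRESPONDING can be sketched in Python as follows. The helper names and sample data are hypothetical. The sketch assumes that, when a BY list is given, the result columns follow the order of that list; this detail is not stated in the text above.

```python
def corresponding_columns(left_cols, right_cols, by=None):
    """Return the column names used to combine the two tables."""
    if by is not None:
        missing = [c for c in by if c not in left_cols or c not in right_cols]
        if missing:
            raise ValueError("columns not present on both sides: %s" % missing)
        return list(by)
    # No BY list: common column names, in the order of the first table
    return [c for c in left_cols if c in right_cols]

def union_corresponding(left_cols, left_rows, right_cols, right_rows, by=None):
    """Model of: left UNION CORRESPONDING [BY (...)] right."""
    cols = corresponding_columns(left_cols, right_cols, by)

    def project(names, rows):
        indexes = [names.index(c) for c in cols]
        return [tuple(row[i] for i in indexes) for row in rows]

    merged = project(left_cols, left_rows) + project(right_cols, right_rows)
    # UNION [DISTINCT]: drop duplicate rows, keeping first-seen order
    return cols, list(dict.fromkeys(merged))

# Hypothetical tables with different column lists
a_cols, a_rows = ["t", "u", "v"], [(1, 2, 3)]
b_cols, b_rows = ["v", "u", "x"], [(3, 2, 9), (5, 4, 0)]

cols_all, rows_all = union_corresponding(a_cols, a_rows, b_cols, b_rows)
cols_by, rows_by = union_corresponding(a_cols, a_rows, b_cols, b_rows, by=["v"])
print(cols_all, rows_all)
print(cols_by, rows_by)
```

Without a BY list the common columns u and v are used even though they appear in a different order in the second table.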
The type of each column of the query expression is determined by combining the types of the corresponding columns from the two participating tables.
When the rows of the result table have been formed, it is possible to specify the order in which they are returned to the user. The ORDER BY clause is used to specify the columns used for ordering, and whether ascending or descending ordering is used. It can also specify whether NULL values are returned first or last.
SELECT x + y AS xysum, y + z AS yzsum FROM atable AS t (u, v, w, x, y, z) WHERE u + v = w ORDER BY xysum NULLS LAST, yzsum NULLS FIRST
The ORDER BY clause specifies one or more <value
      expressions>. The list of rows is sorted according to the
      first <value expression>. When some rows are
      sorted equal then they are sorted according to the next
      <value expression> and so on.
<order by clause> ::= ORDER BY <sort
      specification> [ { <comma> <sort specification> }...
      ]
<sort specification> ::= <value expression> [
      <collate clause> ] [ ASC | DESC ] [ NULLS FIRST | NULLS LAST
      ]
The defaults are ASC and NULLS FIRST. Two database properties, SQL NULLS FIRST and SQL NULLS ORDER, can be modified to change the default behaviour.
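The interaction of ASC or DESC with NULLS FIRST or NULLS LAST can be illustrated with a small Python model for a single sort column. The function name `order_by` is hypothetical; its defaults mirror the defaults stated above, and the effect of the two database properties is not modelled.

```python
def order_by(values, descending=False, nulls_first=True):
    """Sort one column; None stands for SQL NULL and is placed as requested."""
    nulls = [v for v in values if v is None]
    others = sorted((v for v in values if v is not None), reverse=descending)
    return nulls + others if nulls_first else others + nulls

column = [3, None, 1, 2]

default_order = order_by(column)                                      # ASC NULLS FIRST
desc_nulls_last = order_by(column, descending=True, nulls_first=False)
print(default_order)
print(desc_nulls_last)
```

The position of the nulls is chosen independently of the sort direction, which is why all four combinations are possible.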
A collation is used for columns of an ORDER BY expression that are
      of the type CHAR or VARCHAR. If a <collate
      clause> is not specified then the collation of the column,
      or the default collation of the database is used.
The default collation for a database is ASCII, with lowercase letters sorted after all uppercase letters. The example below shows the effect of collation on an ordered list.
-- default collation for the database is ASCII
SELECT id, lastname FROM customer ORDER BY lastname

ID LASTNAME
-- --------
40 Clancy
36 King
35 White
6  king

-- a language collation is used, it treats king and King as adjacent entries
SELECT id, lastname FROM customer ORDER BY lastname COLLATE "English"

ID LASTNAME
-- --------
40 Clancy
6  king
36 King
35 White
In the above example, if the LASTNAME column is
      itself defined in the table definition with COLLATE
      "English", then the COLLATE clause is not necessary in the
      ORDER BY expression.
An ORDER BY operation can sometimes be optimised by the engine when it can use the same index for accessing the table data and ordering. Optimisation can happen both with DESC + NULLS LAST and ASC + NULLS FIRST.
sort specification list
<sort specification list> ::= <value
      expression> [ASC | DESC] [NULLS FIRST | NULLS
      LAST]
Specify a sort order. A sort operation is performed on the
      result of a <query expression> or
      <query specification> and sorts the result
      according to one or more <value expression>.
      The <value expression> is usually a single
      column of the result, but in some cases it can be a column of the
      <table expression> that is not used in the
      select list. The default is ASC and NULLS FIRST.
A different form of limiting the rows can be performed on the result table after it has been formed according to all the other operations (selection, grouping, ordering etc.). This is specified by the FETCH ... ROWS and OFFSET clauses of a SELECT statement. In this form, the specified OFFSET rows are removed from the start of the table, then up to the specified FETCH rows are kept and the rest of the rows are discarded.
<result offset clause> ::= OFFSET <offset row
      count> { ROW | ROWS }
<fetch first clause> ::= FETCH { FIRST | NEXT } [
      <fetch first row count> ] { ROW | ROWS } ONLY [ USING INDEX
      ]
<limit clause> ::= LIMIT <fetch first row
      count> [ USING INDEX ]
A slicing operation takes the result set that has been already processed and ordered. It then discards the specified number of rows from the start of the result set and returns the specified number of rows after the discarded rows. The <offset row count> and <fetch first row count> can be constants, dynamic variables, routine parameters, or routine variables. The type of the constants must be INTEGER.
SELECT a, b FROM atable WHERE d < 5 ORDER BY absum OFFSET 3 FETCH 2 ROWS ONLY
SELECT a, b FROM atable WHERE d < 5 ORDER BY absum OFFSET 3 LIMIT 2 /* alternative keyword */
When the FETCH keyword is used, the specified number of rows must be at least 1, otherwise an error is returned. This behaviour is consistent with the SQL Standard. When the LIMIT keyword is used, the specified number of rows can be zero, which means return all rows (no LIMIT). In MySQL and PostgreSQL syntax modes, zero limit means no rows (empty result).
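The slicing rules above can be modelled in Python on an already ordered list of rows. The function names are hypothetical. The sketch follows the behaviour described in the text for the default syntax mode only; the MySQL and PostgreSQL modes, where a zero limit returns no rows, are not modelled.

```python
def fetch_rows(rows, offset=0, fetch=None):
    """Model of OFFSET n ROWS FETCH m ROWS ONLY on an ordered result."""
    if fetch is not None and fetch < 1:
        raise ValueError("FETCH requires a row count of at least 1")
    remaining = rows[offset:]  # discard the first `offset` rows
    return remaining if fetch is None else remaining[:fetch]

def limit_rows(rows, limit, offset=0):
    """Model of OFFSET n LIMIT m in the default mode, where LIMIT 0 means no limit."""
    remaining = rows[offset:]
    return remaining if limit == 0 else remaining[:limit]

ordered = list(range(10))  # stand-in for an ordered result set

page = fetch_rows(ordered, offset=3, fetch=2)
unlimited = limit_rows(ordered, limit=0, offset=3)
print(page)
print(unlimited)
```

Both forms discard the offset rows first and only then apply the row count.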
If there is an index on all the columns specified in the ORDER BY
      clause, it is normally used for slicing. In some queries, an index on
      another column may take precedence because it is used to process the
      WHERE condition. It is possible to add USING INDEX to
      the end of the slicing clause to force the use of the index for ordering
      and slicing, instead of the index for the WHERE condition.
A query expression, for example a SELECT statement, uses indexes for efficient data retrieval. The EXPLAIN PLAN statement lists the indexes used, together with other useful information about the query.
EXPLAIN PLAN
explain plan
<explain plan> ::= EXPLAIN PLAN FOR <query
      expression>
For example, EXPLAIN PLAN FOR SELECT * FROM REVENUE
      WHERE COUNTRY = 'UK'.
DELETE FROM
delete statement: searched
<delete statement: searched> ::= DELETE FROM
      <target table> [ [ AS ] <correlation name> ] [ WHERE
      <search condition> ][ LIMIT <fetch first row count>
      ]
Delete rows of a table. The search condition is a
      <boolean value expression> that is evaluated
      for each row of the table. If the condition is true, the row is deleted.
      If the condition is not specified, all the rows of the table are
      deleted. In fact, an implicit SELECT is performed in the form of
      SELECT * FROM <target table> [ WHERE <search
      condition>] and the selected rows are deleted. When used in
      JDBC, the number of rows returned by the implicit SELECT is returned as
      the update count.
If there are FOREIGN KEY constraints on other tables that reference the subject table, and the FOREIGN KEY constraints have referential actions, then rows from those other tables that reference the deleted rows are either deleted, or updated, according to the specified referential actions.
The LIMIT clause, or alternatively the ROWNUM() function in the WHERE clause, can be used to limit the number of rows that are deleted. This is useful when a very large number of rows needs to be deleted. In this situation, you can perform the operation in chunks and commit after each chunk to reduce memory usage and the total time of the operation.
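The chunk-and-commit pattern can be sketched as a loop that repeats a limited delete until no rows are affected. The sketch below runs against an in-memory SQLite database through Python's sqlite3 module purely as a stand-in, so the delete is written with a rowid subquery; with HyperSQL the statement would use the LIMIT clause from the syntax above (for example DELETE FROM T WHERE C > 5 LIMIT 1000) and would be executed through JDBC. The table, column and function names are hypothetical.

```python
import sqlite3

def delete_in_chunks(conn, chunk_size):
    """Delete matching rows a chunk at a time, committing after each chunk."""
    total = 0
    while True:
        # SQLite stand-in for: DELETE FROM t WHERE c > 5 LIMIT <chunk_size>
        cur = conn.execute(
            "DELETE FROM t WHERE rowid IN "
            "(SELECT rowid FROM t WHERE c > 5 LIMIT ?)",
            (chunk_size,),
        )
        conn.commit()  # keep each transaction small
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(20)])
conn.commit()

deleted = delete_in_chunks(conn, chunk_size=4)
remaining = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(deleted, remaining)
```

The loop ends on the first pass that deletes nothing, so the chunk size only affects how many commits are performed.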
In the second example below, the rows that have the maximum value for column A are deleted.
DELETE FROM T WHERE C > 5
DELETE FROM T AS TT WHERE TT.A = (SELECT MAX(A) FROM T)
TRUNCATE TABLE
truncate table statement
<truncate table statement> ::= TRUNCATE TABLE
      <target table> [ <identity column restart option> ] [
      <truncate options> ]
<identity column restart option> ::= CONTINUE
      IDENTITY | RESTART IDENTITY
<truncate options> ::= AND COMMIT [ NO CHECK
      ]
<truncate table versioning statement> ::=
      TRUNCATE TABLE <target table> VERSIONING TO { TIMESTAMP'YYYY-MM-DD
      HH:MM:SS' | CURRENT_TIMESTAMP }
Delete all rows of a table without firing its triggers. This
      statement can only be used on base tables (not views). If the table is
      referenced in a FOREIGN KEY constraint defined on another table, the
      statement causes an exception. Triggers defined on the table are not
      executed with this statement. The default for <identity
      column restart option> is CONTINUE
      IDENTITY. This means no change to the IDENTITY sequence of the
      table. If RESTART IDENTITY is specified, then the
      sequence is reset to its start value.
TRUNCATE is faster than ordinary DELETE. The TRUNCATE statement is an SQL Standard data change statement; therefore it is performed under transaction control and can be rolled back if the connection is not in the auto-commit mode.
HyperSQL also supports the optional AND COMMIT and NO CHECK options. If AND COMMIT is used, then the transaction is committed with the execution of the TRUNCATE statement. The action cannot be rolled back. If the additional NO CHECK option is also specified, then the TRUNCATE statement is executed even if the table is referenced in a FOREIGN KEY constraint defined on another, non-empty table. This form of TRUNCATE is faster than the default form and does not use much memory.
The <truncate table versioning
      statement> is for removing old history rows from a
      system-versioned table. All history rows that expired before the given
      timestamp are removed. No current row is removed.
TRUNCATE SCHEMA
truncate schema statement
<truncate schema statement> ::= TRUNCATE SCHEMA
      <target schema> [ <identity column restart option> ] AND
      COMMIT [ NO CHECK ]
Performs the equivalent of a TRUNCATE TABLE ... AND COMMIT on all the tables in the schema. If the additional NO CHECK option is also specified, then the TRUNCATE statement is executed even if any of the tables in the schema is referenced in a FOREIGN KEY constraint defined on a non-empty table in a different schema.
If RESTART IDENTITY is specified, all table IDENTITY sequences and all SEQUENCE objects in the schema are reset to their start values.
Use of this statement requires schema ownership or administrative privileges.
INSERT INTO
insert statement
<insert statement> ::= INSERT INTO <target
      table> [ [ AS ] <correlation name> ] <insert columns and
      source>
<insert columns and source> ::= <from
      subquery> | <from constructor> | <from
      default>
<from subquery> ::= [ <left paren>
      <insert column list> <right paren> ] [ <override
      clause> ] <query expression>
<from constructor> ::= [ <left paren>
      <insert column list> <right paren> ] [ <override
      clause> ] <contextually typed table value
      constructor>
<override clause> ::= OVERRIDING USER VALUE |
      OVERRIDING SYSTEM VALUE
<from default> ::= DEFAULT
      VALUES
<insert column list> ::= <column name
      list>
Insert new rows in a table. An INSERT statement inserts one or more rows into the table.
The special form, INSERT INTO <target table>
      DEFAULT VALUES can be used with tables which have a default
      value for each column.
With the other forms of INSERT, the optional
      (<insert column list>) specifies to which
      columns of the table the new values are assigned.
In one form, the inserted values are from a <query
      expression> and all the rows that are returned by the
      <query expression> are inserted into the table.
      If the <query expression> returns no rows,
      nothing is inserted.
In the other form, a comma separated list of values called
      <contextually typed table value constructor> is
      used to insert one or more rows into the table. This list is
      contextually typed, because the keywords NULL and DEFAULT can be used
      for the values that are assigned to each column of the table. In this
      form, the keyword DEFAULT means the default value of the column and can
      be used only if the target column has a default value or is an IDENTITY
      or GENERATED column of the table.
The <override clause> must be used
      when a value is explicitly assigned to a column that has been defined as
      GENERATED ALWAYS AS IDENTITY. The OVERRIDING SYSTEM VALUE clause must be
      used to override the sequence value with the user-supplied values. For
      columns defined as GENERATED BY DEFAULT AS IDENTITY, there is no need to
      use OVERRIDING when the user provides values to be used for the insert.
      The OVERRIDING USER VALUE clause can be used with all types of GENERATED
      columns and means the values provided by the user are simply ignored and
      new values generated by the system are used instead. Two examples of
      table definition are given below.
CREATE TABLE t1 (id INTEGER GENERATED ALWAYS AS IDENTITY(START WITH 100), name VARCHAR(20) PRIMARY KEY)
CREATE TABLE t2 (id INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 1) PRIMARY KEY, name VARCHAR(20))
In both examples below, the value inserted for the id column is 14. In the first example, it is necessary to use OVERRIDING SYSTEM VALUE when inserting into the id column of table t1 because the column has been defined as GENERATED ALWAYS. In the second example, no OVERRIDING clause is required for the insert into table t2 as its id column is defined as GENERATED BY DEFAULT. In both examples, if there is an existing row with that value as primary key, a constraint violation exception is thrown.
INSERT INTO t1 (id, name) OVERRIDING SYSTEM VALUE VALUES ( 14, 'Test Value')
INSERT INTO t2 (id, name) VALUES ( 14, 'Test Value')
In the examples below, OVERRIDING USER VALUE is used to let the system generate values for the id column. The generated values override the value 14 in the first example, and the existing values for the id column in the table in the second example.
INSERT INTO t1 (id, name) OVERRIDING USER VALUE VALUES ( 14, 'Another Test Value')
INSERT INTO t1 (id, name) OVERRIDING USER VALUE (SELECT * FROM t1)
An array can be inserted into a column of the array type by using literals, by specifying a parameter in a prepared statement, or by using an existing array returned by a query expression. The last example below inserts an array.
The rows that are inserted into the table are checked against all the constraints that have been declared on the table. The whole INSERT operation fails if any row fails to be inserted due to a constraint violation. Examples:
CREATE TABLE T (A INTEGER GENERATED BY DEFAULT AS IDENTITY, B INTEGER DEFAULT 2)
INSERT INTO T DEFAULT VALUES /* all columns of T have DEFAULT clauses */
INSERT INTO T (SELECT * FROM Z) /* table Z has the same columns as table T */
INSERT INTO T (A,B) VALUES ((1,2),(3,NULL), (DEFAULT,6)) /* three rows are inserted into table T */
ALTER TABLE T ADD COLUMN D VARCHAR(10) ARRAY /* an ARRAY column is added */
INSERT INTO T VALUES DEFAULT, 3, ARRAY['hot','cold']
If the table contains an IDENTITY column, the value for this
      column for the last row inserted by a session can be retrieved using a
      call to the IDENTITY() function. This call returns the last value
      inserted by the calling session. When the insert statement is executed
      with a JDBC Statement or PreparedStatement method, the
      getGeneratedKeys() method of Statement can be
      used to retrieve not only the IDENTITY column, but also any GENERATED
      computed column, or any other column. The
      getGeneratedKeys() returns a ResultSet with one
      or more columns. This contains one row per inserted row, and can
      therefore return all the generated columns for a multi-row
      insert.
There are three methods of specifying which generated keys should be returned. The first method does not specify the columns of the table. With this method, the returned ResultSet will have a column for each column of the table that is defined as GENERATED ... AS IDENTITY or GENERATED ... AS (<expression>). The two other methods require the user to specify which columns should be returned, either by column indexes, or by column names. With these methods, there is no restriction on which columns of the inserted rows can be returned. This is especially useful when some columns have a default clause which is a function, or when there are BEFORE triggers on the table that may provide the inserted value for some of the columns.
In MySQL syntax compatibility mode, HyperSQL supports INSERT IGNORE, REPLACE and ON DUPLICATE KEY UPDATE variations of the INSERT statement.
UPDATE
update statement: searched
<update statement: searched> ::= UPDATE
      <target table> [ [ AS ] <correlation name> ] SET <set
      clause list> [ WHERE <search condition> ][ LIMIT <fetch
      first row count> ]
Update rows of a table. An UPDATE statement selects rows from
      the <target table> using an implicit SELECT
      statement formed in the following manner:
SELECT * FROM <target table> [ [ AS ]
      <correlation name> ] [ WHERE <search condition>
      ]
Then it applies the SET <set clause
      list> expression to each selected row.
If the implicit SELECT returns no rows, no update takes place. When used in JDBC, the number of rows returned by the implicit SELECT is returned as the update count.
If there are FOREIGN KEY constraints on other tables that reference the subject table, and the FOREIGN KEY constraints have referential actions, then rows from those other tables that reference the updated rows are updated, according to the specified referential actions.
The rows that are updated are checked against all the constraints that have been declared on the table. The whole UPDATE operation fails if any row violates any constraint.
The LIMIT clause, or alternatively the ROWNUM() function in the WHERE clause, can be used to limit the number of rows that are updated. This is useful when a very large number of rows needs to be updated. In this situation, you can perform the operation in chunks and commit after each chunk to reduce memory usage and the total time of the operation.
set clause list
<set clause list> ::= <set clause> [ {
      <comma> <set clause> }... ]
<set clause> ::= <multiple column
      assignment> | <set target> <equals operator> <update
      source>
<multiple column assignment> ::= <set target
      list> <equals operator> <assigned
      row>
<set target list> ::= <left paren> <set
      target> [ { <comma> <set target> }... ] <right
      paren>
<assigned row> ::= <contextually typed row
      value expression>
<set target> ::= <column
      name>
<update source> ::= <value expression> |
      <contextually typed value specification>
Specify a list of assignments. This is used in UPDATE, MERGE and SET statements to assign values to a scalar or row target.
Apart from setting a whole target to a value, a SET statement can set individual elements of an array to new values. The last example below shows this form of assignment to the array in the column named B.
In the examples given below, UPDATE statements with single and multiple assignments are shown. Note that in the third example, a SELECT statement is used to provide the update values for columns A and C, while the update value for column B is given separately. The SELECT statement must return exactly one row. In this example the SELECT statement refers to the existing value for column C in its search condition.
UPDATE T SET A = 5 WHERE ...
UPDATE T SET (A, B) = (1, NULL) WHERE ...
UPDATE T SET (A, C) = (SELECT X, Y FROM U WHERE Z = C), B = 10 WHERE ...
UPDATE T SET A = 3, B[3] = 'warm'
MERGE INTO
merge statement
<merge statement> ::= MERGE INTO <target
      table> [ [ AS ] <merge correlation name> ] USING <table
      reference> ON <search condition> <merge operation
      specification>
<merge correlation name> ::= <correlation
      name>
<merge operation specification> ::= <merge
      when clause>...
<merge when clause> ::= <merge when matched
      clause> | <merge when not matched clause>
<merge when matched clause> ::= WHEN MATCHED [
      AND <search condition> ] THEN <merge update or delete
      specification>
<merge when not matched clause> ::= WHEN NOT
      MATCHED [ AND <search condition> ] THEN <merge insert
      specification>
<merge update specification> ::= UPDATE SET
      <set clause list>
<merge delete specification> ::= DELETE
      
<merge insert specification> ::= INSERT [
      <left paren> <insert column list> <right paren> ] [
      <override clause> ] VALUES <merge insert value
      list>
<merge insert value list> ::= <left paren>
      <merge insert value element> [ { <comma> <merge insert
      value element> }... ] <right paren>
<merge insert value element> ::= <value
      expression> | <contextually typed value
      specification>
Update rows, delete rows or insert new rows into the
      <target table>. The MERGE statement uses a
      second table, specified by <table reference>,
      to determine the rows to be updated or inserted. It is possible to use
      the statement only to update rows, to delete rows or to insert rows, but
      usually both update and insert are specified.
The <search condition> matches each
      row of the <table reference> with each row of
      the <target table>. If the two rows match then
      the UPDATE clause is used to update the matching row of the target
      table. Those rows of <table reference> that
      have no matching rows are then used to insert new rows into the
      <target table>. Therefore, a MERGE statement
      can update or delete between 0 and all the rows of the
      <target table> and can insert between 0 and the
      number of the rows in <table reference> into
      the <target table>. If any row in the
      <target table> matches more than one row in
      <table reference> a cardinality error is
      raised. On the other hand, several rows in the <target
      table> can match a single row in <table
      reference> without any error. The constraints and
      referential actions specified on the database tables are enforced the
      same way as for an update, a delete and an insert statement.
The optional <search condition> in
      each WHEN clause can be used to filter (reduce) the rows for the
      particular action.
The MERGE statement can be used with only the WHEN NOT MATCHED clause as a conditional INSERT statement that inserts a row if no existing rows match a condition.
In the first example below, the table originally contains two
      rows for different furniture. The <table
      reference> is the (VALUES(1, 'conference table'),
      (14, 'sofa'), (5, 'coffee table')) expression, which evaluates
      to a table with 3 rows. When the x value for a row matches an existing
      row, then the existing row is updated. When the x value does not match,
      the row is inserted. Therefore one row of table t is updated from
      'dining table' to 'conference table', and two rows are inserted into
      table t. The second example uses a SELECT statement as the source of the
      values for the MERGE.
In the third example, a new row is inserted into the table only when the primary key for the new row does not exist. This example uses parameters and should be executed as a JDBC PreparedStatement. The first parameter is cast as INTEGER because the MERGE statement does not determine the types of values in the USING clause.
In the fourth example, existing rows that match are deleted.
CREATE TABLE t (id INT PRIMARY KEY, description VARCHAR(100))
INSERT INTO t VALUES (1, 'dining table'), (2, 'deck chair')

MERGE INTO t USING (VALUES(1, 'conference table'), (14, 'sofa'), (5, 'coffee table')) AS vals(x,y)
  ON t.id = vals.x
  WHEN MATCHED THEN UPDATE SET t.description = vals.y
  WHEN NOT MATCHED THEN INSERT VALUES vals.x, vals.y

MERGE INTO t USING (SELECT * FROM tt WHERE acol = 2) AS vals(x,y)
  ON t.id = vals.x
  WHEN MATCHED THEN UPDATE SET t.description = vals.y
  WHEN NOT MATCHED THEN INSERT VALUES vals.x, vals.y

MERGE INTO t USING (VALUES(CAST(? AS INT))) AS vals(x)
  ON t.id = vals.x
  WHEN NOT MATCHED THEN INSERT VALUES vals.x, ?

MERGE INTO t USING (SELECT * FROM tt WHERE acol = 2) AS vals(x,y)
  ON t.id = vals.x
  WHEN MATCHED THEN DELETE
  WHEN NOT MATCHED THEN INSERT VALUES vals.x, vals.y
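None of the examples above uses the optional search condition in a WHEN clause. The following is a minimal sketch of that form, assuming table t holds only the two rows originally inserted by the first example. The source values, and the convention that an empty string means "no description supplied", are illustrative assumptions rather than part of the original examples.

```sql
-- Sketch only: the VALUES rows and the empty-string convention are hypothetical.
-- Rows of vals with an empty description are filtered out of both actions.
MERGE INTO t USING (VALUES(1, 'conference table'), (2, ''), (14, 'sofa'), (15, '')) AS vals(x,y)
  ON t.id = vals.x
  WHEN MATCHED AND vals.y <> '' THEN UPDATE SET t.description = vals.y
  WHEN NOT MATCHED AND vals.y <> '' THEN INSERT VALUES vals.x, vals.y
```

Under these assumptions, the row with id 1 would be updated and a row with id 14 inserted. The row with id 2 matches but is left unchanged, and no row with id 15 is inserted, because the additional condition is false for those source rows.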
HyperSQL supports some SQL statements, expressions, functions, and Java methods that report on the most recently executed statement.
The IDENTITY() function returns the last inserted
    identity value for the current session.
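As a minimal sketch of how the function might be used, assume a hypothetical table named orders with an identity column. The table and its columns are invented for illustration and do not appear elsewhere in this chapter.

```sql
-- Hypothetical table with a generated identity column.
CREATE TABLE orders (id INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY, item VARCHAR(100))

-- The id value is generated because the column is omitted from the insert.
INSERT INTO orders (item) VALUES ('sofa')

-- Returns the identity value generated by the insert above, in this session.
CALL IDENTITY()
```

Because the value is tracked for the current session, inserts performed by other sessions do not change the result.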
The GET DIAGNOSTICS statement is supported to a
    limited extent. The built-in function DIAGNOSTICS() is
    an alternative. These are normally used in SQL/PSM routines to check the
    result of the last data update operation.
GET DIAGNOSTICS
get diagnostics statement
<get diagnostics statement> ::= GET DIAGNOSTICS
    <simple target value specification> = ROW_COUNT
The <simple target value specification> is
    a session variable, a routine variable, or an OUT parameter.
The keyword ROW_COUNT specifies the row count
    returned by the last executed statement. For INSERT, UPDATE, DELETE and
    MERGE statements, this is the number of rows affected by the statement.
    This is the same value as returned by JDBC
    executeUpdate() methods. For all other statements, zero
    is returned.
The value of ROW_COUNT is stored in the specified
    target.
This statement is often used in CREATE PROCEDURE statements.
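The following sketch shows one way the statement could appear inside a procedure. The procedure name rename_item and its parameters are hypothetical. The table t is assumed to be the one created in the MERGE examples, with columns id and description.

```sql
-- Sketch only: procedure and parameter names are illustrative assumptions.
CREATE PROCEDURE rename_item(IN p_id INT, IN p_description VARCHAR(100), OUT p_count INT)
  MODIFIES SQL DATA
  BEGIN ATOMIC
    UPDATE t SET description = p_description WHERE id = p_id;
    -- Store the number of rows changed by the UPDATE in the OUT parameter.
    GET DIAGNOSTICS p_count = ROW_COUNT;
  END
```

A caller can then inspect the OUT parameter to tell whether a matching row existed. With id as the primary key, the count would be expected to be 0 or 1.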
In future versions, more options will be supported for diagnostics values.
$Revision: 6145 $