This document has not been updated for Java. The API documentation for Java is complete, but many ReQL articles still have examples in other languages. We’ll be updating each article after the Java driver is officially released.
Don’t see the recipe you’re looking for? Request or add a recipe by opening an issue on GitHub.
You can use the dbCreate
command as follows:
r.dbCreate("blog").run(conn, function(err, result) { if (err) throw err; console.log(result); });
Another way to create a database is through the web UI. You can reach the web UI at http://HOST:8080
. Click on the Tables tab at the top and then use the Add Database button.
The easiest way to rename a database is to use the config command to access the db_config
system table, and then simply use the update
command.
r.db("old_db_name").config().update({name: "new_db_name"}).run(conn, function(err, result) { if (err) throw err; console.log(result); } );
You can select the database where you’d like to create the table with the db
command and use the tableCreate
command as follows:
r.db("blog").tableCreate("posts").run(conn, function(err, result) { if (err) throw err; console.log(result); });
Note that you can omit the db
command if you’re creating a table in the default database on your connection (set to test
unless specified in connect
).
Another way to create a new table is to use the web UI. You can reach the web UI at http://HOST:8080
. Click on the Tables tab at the top of the page and then use the Add Table button.
You can insert documents by calling the insert
command on the appropriate table:
r.table("user").insert({ name: "Michel", age: 26 }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
You can insert multiple documents at once by passing an array of documents to insert
as follows:
r.table("user").insert([ { name: "Michel", age: 26 }, { name: "Slava", age: 30 } ]).run(conn, function(err, result) { if (err) throw err; console.log(result); });
To delete documents, select the documents you’d like to delete and use the delete
command. For example, let’s delete all posts with the author “Michel”:
r.table("posts").filter(r.row("author").eq("Michel")).delete().run(conn, function(err, result) { if (err) throw err; console.log(result); } );
Or, let’s try to delete a single user:
r.table("posts").get("7644aaf2-9928-4231-aa68-4e65e31bf219").delete().run(conn, function(err, result) { if (err) throw err; console.log(result); } );
Here is how we’d delete all documents in a table:
r.table("posts").delete().run(conn, function(err, result) { if (err) throw err; console.log(result); });
To get all documents in a table, simply use the table
command:
r.table("posts").run(conn, function(err, result) { if (err) throw err; console.log(result); });
The table
command returns a cursor; use the next or each command to iterate through the result set, or toArray to retrieve the set as an array.
To get a specific document by ID, use get
:
r.table("posts").get(1).run(conn, function(err, result) { if (err) throw err; console.log(result); });
To retrieve documents by the value of a specific field, use filter
:
r.table("posts").filter({author: "Michel"}).run(conn, function(err, result) { if (err) throw err; console.log(result); });
To retrieve documents by the value of a specific index, use getAll
:
r.table("posts").getAll("review", {index: "category"}).run(conn, function(err, result) { if (err) throw err; console.log(result); });
(For more complex filtering recipes, read on.)
Suppose you’d like to select all posts where the author’s name is “Michel” and the category is “Geek”. You can do it as follows:
r.table("posts").filter({ author: "Michel", category: "Geek", }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Alternatively, you can build a predicate with the and
command, and pass it to filter
:
r.table("posts").filter( r.row("author").eq("Michel").and(r.row("category").eq("Geek")) ).run(conn, function(err, result) { if (err) throw err; console.log(result); });
You can also use the prefix notation (passing all arguments to r.and
), if that’s what you prefer:
r.table("posts").filter( r.and(r.row("author").eq("Michel"), r.row("category").eq("Geek")) ).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Similarly, you can use the r.or
command to filter based on one of many conditions.
Suppose we have a table users
with documents of the following form:
{ name: "William Adama" emails: ["[email protected]", "[email protected]"], ship: "Galactica" }
If we want to retrieve all users that have the email address [email protected]
, we can write:
r.table("user").filter(r.row("emails").contains("[email protected]")) .run(conn, function(err, result) { if (err) throw err; console.log(result); });
If we want to retrieve all users on the Galactica and Pegasus, we can write:
r.table("user").filter(function (user) { r(["Galactica", "Pegasus"]).contains(user("ship")) }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
In JavaScript you can use the operator ()
to get the value of a field. This operator can be chained to retrieve values from nested fields.
Suppose we have a table users
with documents of the following form:
{ name: "William Adama" contact: { phone: "555-5555" email: "[email protected]" } }
Let’s filter based on the nested field email
:
r.table("user").filter( r.row("contact")("email").eq("[email protected]") ).run(conn, function(err, result) { if (err) throw err; console.log(result); });
For many ReQL commands, you can also use a JSON-style nested syntax that allows considerably more flexibility. Read “Accessing nested fields” for more information.
If you want to retrieve all the posts with the primary keys 1
, 2
, or 3
you can use the getAll
command:
r.table("posts").getAll(1, 2, 3).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Suppose we have a table posts
that links posts to authors via an author_id
field. If we’ve created a secondary index on author_id
and want to retrieve all the posts where author_id
is 1
, 2
, or 3
, we can use the getAll
command to do it as follows:
r.table("posts").getAll(1, 2, 3, {index: 'author_id'}) .run(conn, function(err, result) { if (err) throw err; console.log(result); });
Read about creating secondary indexes in RethinkDB.
If you’re using a command that returns a stream and want to retrieve all of its results at once in an array rather than iterating through them with the cursor object, you can coerce it to an array using the toArray command.
r.table('posts').run(conn, function(err, cursor) { if (err) throw err; cursor.toArray(function(result) { console.log(result); }); });
See the data type documentation for more detail about streams.
If you need to retrieve only a few specific fields from your documents, you can use the pluck
command. For example, here is how you’d return only the fields name
and age
from each row in table users
:
r.table("users").pluck("name", "age").run(conn, function(err, result) { if (err) throw err; console.log(result); });
This is equivalent to calling SELECT name, age FROM users
in SQL.
The pluck
command also supports selecting nested fields in a document. For example, suppose we’d like to select the fields phone
and email
from the following document:
{ name: "William Adama" contact: { phone: "555-5555" email: "[email protected]" } }
We can use the following syntax:
r.table("users").pluck( {contact: {phone: true, email: true}} ).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Suppose you want to retrieve all the posts whose date field is between January 1st, 2012 (included) and January 1st, 2013 (excluded). You could do:
r.table("posts").filter(function(post) { return post("date").during(r.time(2012, 1, 1, 'Z'), r.time(2013, 1, 1, 'Z')); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
You can also manually compare dates:
r.table("posts").filter(function(post) { return post("date").ge(r.time(2012, 1, 1, 'Z')).and( post("date").lt(r.time(2013, 1, 1, 'Z'))); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
If you want to retrieve all users whose last name starts with “Ma”, you can use r.match
this way:
// Will return Martin, Martinez, Marshall etc. r.table("users").filter(function(user) { return user("lastName").match("^Ma"); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
If you want to retrieve all users whose last name ends with an “s”, you can use r.match
this way:
// Will return Williams, Jones, Davis etc. r.table("users").filter(function(user) { return user("lastName").match("s$"); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
If you want to retrieve all users whose last name contains “ll”, you can use r.match
this way:
// Will return Williams, Miller, Allen etc. r.table("users").filter(function(user) { return user("lastName").match("ll"); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Retrieve all users whose name is “William” (case insensitive).
// Will return william, William, WILLIAM, wiLLiam etc. r.table("users").filter(function(user) { return user("name").match("(?i)^william$"); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
If you want to perform a query that returns aggregations on different fields together, this is a canonical use case for map-reduce.
Suppose a data set that lists top movies, ranked by user vote. You’d like to get the total votes and the average age of the top 25 movies: the avg()
of the year
column and the sum()
of the votes
column, ordered by the rank
column to get the range 1–25.
To perform this, map the first 25 movies into a new result set, adding a count
column, then reduce each row of the mapped result set into a total for each field (votes
, year
and column
). Then use do to return a result set with the total votes and the average year, computed by dividing the sum of the years by their count.
r.table('movies').orderBy('rank').limit(25).map(function (doc) { return { totalVotes: doc('votes'), totalYear: doc('year'), count: 1 }; }).reduce(function (left, right) { return { totalVotes: left('totalVotes').add(right('totalVotes')), totalYear: left('totalYear').add(right('totalYear')), count: left('count').add(right('count')) }; }).do(function (res) { return { totalVotes: res('totalVotes'), averageYear: res('totalYear').div(res('count')) }; }).run(conn, callback);
We’re working on an easier syntax for performing multiple aggregations after group
commands. Follow issue 1725 to track progress on this.
To add or overwrite a field, you can use the update
command. For instance, if you would like to add the field author
with the value “Michel” for all of the documents in the table posts
, you would use:
r.table("posts").update({ author: "Michel" }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
The update
command lets you to overwrite fields, but not delete them. If you want to delete a field, use the replace
command. The replace
command replaces your entire document with the new document you pass as an argument. For example, if you want to delete the field author
of the blog post with the id 1
, you would use:
r.table("posts").get("1").replace(r.row.without('author')) .run(conn, function(err, result) { if (err) throw err; console.log(result); });
All modifications made via the update
and replace
commands are always atomic with respect to a single document. For example, let’s say we’d like to atomically update a view count for a page if the field countable
is set to true, and get back the old and new results in a single query. We can perform this operation as follows:
r.table("pages").update(function(page) { return r.branch(page("countable").eq(true), // if the page is countable { views: page("views").add(1) }, // increment the view count {} // else do nothing ); }, {returnChanges: true}).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Using a similar technique to the last recipe, we can use branch
and replace
to maintain a document’s updated_at
and created_at
fields by either inserting a new document or updating one depending on whether a document with a specified ID exists.
r.table('users').get(id).replace(function (doc) { return r.branch( doc.eq(null), r.expr(userObject).merge({id: id, created_at: r.now()}), doc.merge(userObject).merge({updated_at: r.now()}) ) }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
You can use the epochTime
and ISO8601
commands to convert Unix timestamps (in seconds) and JSON date-time strings (which are in ISO 8601 format) to the ReQL time type. The ReQL driver will also convert JavaScript Date objects into ReQL time.
var theDate = new Date(); var timestamp = theDate.getTime(); var JSONDate = theDate.toJSON(); r.table("dates").insert({ from_object: theDate, from_epoch: r.epochTime(timestamp/1000.0), from_iso: r.ISO8601(JSONDate) }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Use the commands toEpochTime
and toISO8601
to convert back.
It’s possible to increment a field value in a document—for example, a counter—in one step on the server.
r.table('aggregated').get(id).update( { count: r.row('count').default(0).add(1) } ).run(conn, callback);
Use default
to ensure that if the count
field doesn’t already exist in the document, it’s added correctly, rather than letting add
throw an error.
You can limit the number of documents returned by your queries with the limit
command. Let’s retrieve just the first 10 blog posts:
r.table("posts").orderBy("date").limit(10).run(conn, function(err, result) { if (err) throw err; console.log(result); });
There are multiple ways to paginate results in RethinkDB. The most straightforward way is using skip
and limit
(similar to the way SQL’s OFFSET
and LIMIT
work), but that’s also the least efficient. It’s more efficient to use slice
, and even more efficient to use between
with a secondary index.
The slice command returns a range from a given start value through but not including a given end value. This makes it easy to use as a skip
/limit
replacement: the start value is the first item to retrieve, and the end value is the first item plus the limit. To retrieve posts 11-20 from the database using slice
:
r.table("posts").orderBy("date").slice(11,21).run(conn, function(err, result) { if (err) throw err; console.log(result); });
If you have a secondary index, you can use the between command in conjunction with orderBy and limit
. This is the most efficient way to paginate, but requires looking up values in the secondary index field to find the first record of each page.
Suppose you wanted to paginate through a set of users, 25 at a time. You could get the first 25 records efficiently just with limit
.
r.table("users").orderBy({index: "name"}).limit(25).run(conn, function(err, result) { if (err) throw err; console.log(result); });
For each successive page, start with the last name in the previous page.
r.table("users").between(lastName, r.maxval, {leftBound: "open", index: "name"}) .orderBy({index: "name"}).limit(25).run(conn, function(err, result) { if (err) throw err; console.log(result); });
We pass the lastName
saved from the previous set to between
as the start index. For the end index, we pass null
to return documents from the start index to the table’s end. The leftBound
parameter tells between
not to include the first record, since it was already returned as part of the previous page.
You can count the number of documents with a count
command:
r.table("posts").count().run(conn, function(err, result) { if (err) throw err; console.log(result); });
You can compute the average value of a field with the avg
command.
r.table("posts").avg("num_comments").run(conn, function(err, result) { if (err) throw err; console.log(result); });
Suppose we’d like to to retrieve all the posts in the table post
and also return an additional field, comments
, which is an array of all the comments for the relevant post retrieved from the comments
table. We could do this using a subquery:
r.table("posts").merge(function(post) { return { comments: r.table("comments").filter(function(comment) { return comment("id_post").eq(post("id")) }).coerceTo("ARRAY") } }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Suppose the table marks
stores the marks of every students per course:
[ { "name": "William Adama", "mark": 90, "id": 1, "course": "English" }, { "name": "William Adama", "mark": 70, "id": 2, "course": "Mathematics" }, { "name": "Laura Roslin", "mark": 80, "id": 3, "course": "English" }, { "name": "Laura Roslin", "mark": 80, "id": 4, "course": "Mathematics" } ]
You may be interested in retrieving the results in this format:
[ { "name": "Laura Roslin", "Mathematics": 80, "English": 80 }, { "name": "William Adama", "Mathematics": 70, "English": 90 } ]
In this case, you can do a pivot operation with the group
and coerceTo
commands.
r.db('test').table('marks').group('name').map(function (row) { return [row('course'), row('mark')]; }).ungroup().map(function (res) { return r.expr({name: res('group')}).merge(res('reduction').coerceTo('object')); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Note: A nicer syntax will eventually be added. See the Github issue 838 to track progress.
Doing an unpivot operation to “cancel” a pivot one can be done with the concatMap
, map
and keys
commands:
r.table("pivotedMarks").concatMap(function (doc) { return doc.without("id", "name").keys().map(function (course) { return { name: doc("name"), course: course, mark: doc(course) }; }); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Note: A nicer syntax will eventually be added. See the Github issue 838 to track progress.
Suppose we want to rename the field id
to idUser
when retrieving documents from the table users
. In the subquery, we can use merge
to add a new field with the existing field’s value, then without
to delete the old field:
r.table("users").map( r.row.merge({ idUser: r.row("id") }).without("id") ).run(conn, function(err, result) { if (err) throw err; console.log(result); });
ReQL has commands for extracting parts of dates and times, including year, month, day, dayOfWeek and more. You can use these with group to group by various intervals. Suppose you had a table of invoices and wanted to retrieve them in groups ordered by year and month:
r.table("invoices") .group([r.row("date").year(), r.row("date").month()]) .ungroup() .merge({invoices: r.row('reduction'), month: r.row('group')}) .without('reduction', 'group') .orderBy('month') .run(conn, function(err, result) { if (err) throw err; console.log(result); });
(We also use the technique for renaming a field, described above, to give the names “reduction” and “group” more useful names of “invoices” and “month.”) You could use any combination of the ReQL date/time interval commands in the group, or work with the date/time as a native object.
Currently, ReQL has a default limit of 100,000 elements in an array, and the implementation of group
requires the total number of documents grouped to fit within that boundary, so you are limited to 100,000 invoices. This can be changed, however, by passing the arrayLimit
option to run. (Also note that ungroup
always returns an array, although this may change in a future version. Follow issue #2719 for progress on this.)
You can also use this approach with a compound index on the intervals you want to group:
r.table('invoices').indexCreate( 'byDay', [r.row('date').year(), r.row('date').month(), r.row('date').day()] ).run(conn, function(err, result) { if (err) throw err; console.log(result); });
Then you can use that index in the group
function. This query would return the highest-value invoice for each day.
r.table("invoices") .group({index: 'byDay'}) .max('price') .run(conn, function(err, result) { if (err) throw err; console.log(result); });
Efficiently generating monotonically increasing IDs in a distributed system is a surprisingly difficult problem. If an inserted document is missing a primary key, RethinkDB currently generates a random UUID. We will be supporting additional autogeneration schemes in the future (see https://github.com/rethinkdb/rethinkdb/issues/117), but in the meantime, you can use one of the available open-source libraries for distributed id generation (e.g. twitter snowflake).
When you issue a write query (insert
, delete
, update
, or replace
), RethinkDB returns a summary object that looks like this:
{deleted:0, replaced:0, unchanged:0, errors:0, skipped:0, inserted:1}
The most important field of this object is errors
. Generally speaking, if no exceptions are thrown and errors
is 0 then the write did what it was supposed to. (RethinkDB throws an exception when it isn’t even able to access the table; it sets the errors
field if it can access the table but an error occurs during the write. This convention exists so that batched writes don’t abort halfway through when they encounter an error.)
The following fields are always present in this object:
inserted
– Number of new documents added to the database.deleted
– Number of documents deleted from the database.replaced
– Number of documents that were modified.unchanged
– Number of documents that would have been modified, except that the new value was the same as the old value.skipped
– Number of documents that were unmodified in a write operation, because the document is not available to be deleted or updated. The document might have been deleted by a different operation happening concurrently, or in the case of a get
operation the key might not exist.errors
– Number of documents that were left unmodified due to an error.In addition, the following two fields are set as circumstances dictate:
generated_keys
– If you issue an insert query where some or all of the rows lack primary keys, the server will generate primary keys for you and return an array of those keys in this field. (The order of this array will match the order of the rows in your insert query.)first_error
– If errors
is positive, the text of the first error message encountered will be in this field. This is a very useful debugging aid. (We don’t return all of the errors because a single typo can result in millions of errors when operating on a large database.)Sometimes you may want to write a ReQL document with a dynamic key—the field name is stored in a variable. You can do this with the object
command, which takes a list of keys and values ((key, value, key, value ...)
) and returns an object from them.
r.table('users').get(1).update(r.object(propertyName, value)).run(conn, function(err, result) { if (err) throw err; console.log(result); });
The field name can be determined entirely on the server, too. For instance, to update a field whose name is drawn from the value of another field:
r.table('users').forEach(function (doc) { return r.table('users').get(doc('id')).update(r.object(doc('field'), newValue)); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
For a practical example, imagine a data set like the one from the pivot example, where each document represents a student’s course record.
[ { "name": "John", "mark": 70, "id": 1, "course": "Mathematics" }, { "name": "John", "mark": 90, "id": 2, "course": "English" } ]
But you’d like to get a document more like a “report card”:
{ "Mathematics": 70, "English": 90 }
You can accomplish this with object
and a pivot.
r.table("marks").filter({student: "John"}).map(function(mark) { return r.object(mark("course"), mark("mark")); }).reduce(function(left, right) { return left.merge(right); }).run(conn, function(err, result) { if (err) throw err; console.log(result); });
For testing or logging purposes, you might want to capture a created ReQL query as a string. (You can see an example of this in ReQL error messages.) While there is no ReQL command to do this, you can simply use the toString()
method at the end of a query chain, rather than run()
:
r.table('users').filter(r.row('groups').contains('operators')).toString()
It’s a common pattern in some query interfaces to “build” queries programmatically by instantiating a query object, calling it several times in succession to add query commands, then calling the execution command. This lets you dynamically change the query based on conditions at runtime. You might expect to do this in ReQL like so:
var query = r.table('posts'); if (request.filter !== undefined) { query.filter(request.filter); } query.orderBy('date'); query.run(conn, callback);
However, that won’t work! The reason is that the query object doesn’t store state. Each of the commands after the first one is simply running on the original value of query
(in this case, the posts
table). You can solve this by explicitly assigning the output of each new command to the query
variable:
var query = r.table('posts'); if (request.filter !== undefined) { query = query.filter(request.filter); } query = query.orderBy('date'); query.run(conn, callback);
You might want to produce a “union” changefeed to watch multiple tables or queries on just one feed. Since the union
command works with changes
, ReQL makes this fairly straightforward. To monitor two tables at once:
r.table('table1').union(r.table('table2')).changes().run(conn, callback);
You might want to “tag” the tables to make it clear which changes belong to which table.
r.table('table1').merge({table: 'table1'}) .union(r.table('table2').merge({table: 'table2'}) .changes().run(conn, callback);
Also, you can use changes
with each query rather than after the whole.
r.table('table1').filter({flag: 'blue'}).changes() .union(r.table('table2').filter({flag: 'red'}).changes()) .run(conn, callback);
© RethinkDB contributors
Licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.
https://rethinkdb.com/docs/cookbook/java/