Digging into Drupal's Schema
I’m relatively new to Drupal internals and most of the work I do is on the database side. While searching for information on Drupal’s schema, I found very little. During my research, I put together an ER diagram of the schema installed by Drupal 7 (D8 is very similar with only 3 extra tables at time of writing) and decided to share my work. Note that the relationships I discuss here are based on the foreign key documentation that exists in core and my understanding of what I believe other relationships could be. Corrections and comments are very much welcome.
I’ll start off by showing my complete ER diagram below. You will see I grouped tables I found to be related in colored boxes. The image below is just meant to give a general overview of the schema. I will be diving into different parts of the schema in this post. I created this diagram using MySQL Workbench and the model can be downloaded from here if someone wishes to open this up in Workbench. This gist also shows the
ALTER TABLE SQL statements that would need to be issued to actually create these foreign keys in MySQL. I would not recommend doing this right now with Drupal as many things would break.
Without delving into the relationships and details of this diagram, lets first cover some basic details. A stock install of Drupal 7 results in 73 tables being created. 10 of those tables are used for caching purposes:
|cache||caches items not separated out into their own cache tables|
|cache_block||the block modules can cache already built blocks here|
|cache_bootstrap||data required during the bootstrap process can be cached in this table|
|cache_field||stores cached field values|
|cache_filter||caches already filtered pieces of text|
|cache_form||caches recently built forms and their storage data|
|cache_image||caches information about image manipulations that are in progress|
|cache_menu||caches router information as well as generated link trees|
|cache_page||caches compressed pages served to anonymous users|
|cache_path||caches path aliases|
11 tables are created which do not relate to any other tables:
|actions||stores action information|
|batch||stores details about batches (processes that run in multiple HTTP requests)|
|blocked_ips||stores a list of blocked IP addresses|
|flood||controls the threshold of events, such as the number of contact attempts|
|queue||stores items in queues|
|rdf_mapping||stores custom RDF mappings for user-defined content types|
|semaphore||stores semaphores, locks, and flags|
|system||contains a list of all modules, themes, and theme engines that are or have been installed|
|url_alias||contains a list of URL aliases for Drupal paths|
|variable||stores variable/value pairs created by Drupal core or any other module or theme|
The 21 tables listed above are self-explanatory and I’m not going to discuss them any further in this post. They also are independent in that these tables have no relationships with other tables.
Field Related Tables
There are 8 tables installed with core related to fields and field storage:
|field_data_body||stores details about the body field of an entity|
|field_revision_body||stores information about revisions to body fields|
|field_data_comment_body||stores information about comments associated with an entity|
|field_revision_comment_body||stores information about revisions to comments|
|field_data_field_image||stores information about images associated with an entity|
|field_revision_field_image||stores information about revisions to images|
|field_data_field_tags||stores information about tags associated with an entity|
|field_revision_field_tags||stores information about revisions to taxonomy terms/tags associated with an entity|
While I was initially tempted to have these tables related to
node, that would not really be correct since these tables are related to an entity. In D7, entities can be other objects besides nodes, such as users or comments. The
entity_type column in these tables reflects that reality. These tables can be stored in other storage systems such as MongoDB due to the field storage API introduced in Drupal 7.
There are 2 other tables related to fields:
field_config_instance. These tables store field configuration information. I believe a row in
field_config_instance cannot (well at least should not) exist without the correspondong
field_id in the
field_config table. Hence, the one-to-many relationship from
field_config_instance is an identifying relationship.
Small Groups of Tables
There are a number of groups you will notice in the full ER diagram that are made up of 2 to 3 tables. Zooming in on 4 of those groups, we can see those tables more clearly:
One thing you will notice is that some relationships are shown with a solid line whereas others use a dotted line. MySQL Workbench represents identifying relationships with a solid line and non-identifying relationships with a dotted line. If you are unfamiliar with those terms, the standard defintions are:
- identifying relationship - the foreign key attribute is part of the child’s primary key attribute.
- non-identifying relationship - the primary key attributes of the parent must not become primary key attributes of the child.
Now lets discuss those groups in more detail.
I grouped the
registry_file tables together. These tables are used for implementing the code registry in Drupal. A one-to-many relationship exists from
registry and this relationship is an identifying relationship. A
filename should not appear in the
registry table that is not present in the
I grouped the
image_effects tables together. These tables store configuration options for image styles and effects. A one-to-many relationship exists from
image_effects and this relationship is a non-identifying relationship.
There are three tables about date formats in Drupal.
date_format_type is a lookup table that stores configured date format types. After a stock install of Drupal 7, three date format types exist:
A one-to-many relationship exists from this lookup table to both
In practice, this would be problematic. For example, a new date format can be created by an adminstrator. In D7, this results in the
system_date_format_save function being called. This function will insert a row in the
date_formats table that will not have a corresponding type (type will be listed as custom).
You will also notice the
locked column is redundant in the
date_formats table. I submitted a patch to change this.
I grouped the
file_usage tables into 1 group. These tables store information about uploaded files and information for tracking where a file is used.
I believe a 1-to-1 relationship exists from
file_usage and that this is an identifying relationship.
User Related Tables
There are quite a few tables that store user related information. Below is a figure where I zoom in on those tables.
As you can see, the tables directly associated with users are
authmap. These tables are in a one-to-many relationship from
users. The functionality these tables provide is:
|authmap||stores distributed authentication mapping|
|sessions||stores information about a users session|
|watchdog||contains logs of all system events|
There are then two tables that are in a many-to-many relationship with
users that link this table with other groups. One of these is the
users_roles table. This table links
role table is then in a one-to-many relationship with the
role_permission table. The other many-to-many table is
shortcut_set_users. This table links
The tables for the menu system are not really related to users but I placed the group close by since the
menu_links table maintains a one-to-many relationship with the
shortcut_set table. While the tables for the menu system do not appear to be related, I do believe a relationship exists there. In particular, I think that the
menu_link table has relationships to both the
menu_custom tables. The
router_path column in
menu_links could reference the
router column in
menu_router and the
menu_name column in
menu_links could reference the
menu_name in the
menu_custom table. Right now however, after a stock install of D7, a row with a menu name that is not present in
menu_custom will be created in
The menu system tables and a description of what they do is below.
|menu_custom||holds definitions for top-level custom menus|
|menu_links||contains the individual links within a menu|
|menu_router||maps paths to various callbacks|
Node Related Tables
Node is one of the most central concepts in Drupal so as you can imagine, many tables are related to that concept. First off, a high level overview of the tables related to the
node table are shown below.
Tables that are directly related to
node_type table is in many-to-many relationship with
node_revision is in a many-to-one relationship with
node as is
node_access table has only 1 row upon initial installation and references a non-existent node. An issue has been created to address this.
The tables directly related to
node and a description of what they do is below.
|node_access||identifies which realm/grant pairs a user must possess in order to view, update, or delete specific nodes|
|node_revision||stores information about each saved version of a node|
|node_type||stores information about all defined node types|
Four tables in the stock schema are related to taxonomy. These tables are shown in the figure below.
First of all, the
taxonomy_index table is in a many-to-many relationship with the
taxonomy_term_data tables. The
taxonomy_vocabulary table has a one-to-many relationship with the
taxonomy_term_data table. The
taxonomy_term_data table in turn has 2 1-to-many relationships with the
A description of the taxonomy tables is given below.
|taxonomy_index||maintains de-normalized information about node/term relationships|
|taxonomy_term_data||stores term information|
|taxonomy_term_hierarchy||stores the hierarchical relationship between terms|
|taxonomy_vocabulary||stores vocabulary information|
The main table in this group is
block. It has three directly related tables in one-to-many relationships:
A description of the blocks tables is given below.
|blocks||stores block settings|
|block_custom||stores the contents of custom-made blocks|
|block_node_type||stores information that sets up display criteria for blocks based on content type|
|block_role||stores access permissions for blocks based on user roles|
The relationships for the search tables I am a little unsure of. I believe they are as shown in the figure below.
The relationship I’m most unsure of here are between
search_index. I don’t think the one-to-many relationship I have in place from
search_index is correct.
A description of the search tables is given below.
|search_dataset||stores items that will be searched|
|search_index||stores the search index and associates words, items, and scores|
|search_node_links||stores items that link to other nodes|
|search_total||stores search totals for words|
Tables That Relate Nodes to Users
There are three tables in many-to-many relationships between
|comment||stores comments and associated data|
|history||stores a record of which users have read which nodes|
|node_comment_statistics||maintains statistics of nodes and comments posts to show new and updated flags|
comment table could be in its own group. I decided against doing that in this ER diagram since I felt like it would have been a table by itself. Logically, I think of it as either being in the
node_comment_statistics does also maintain a relationship with
comment. This is a non-identifying relationship since a node can exist without any comments.
During this work, I noticed that the column definitions for many foreign key relationships are in-correct which would result in MySQL not allowing these constraints to actually be created. I created an issue and patch for this but it turns out Liam Morland is working on using foreign keys in core and also came across this around the same time as me.
Other issues I encountered have also been logged by Liam:
node_accesstable references a non-existent node (relevant issue)
- a set name exists in
shortcut_setthat does not exist in
I would vote for foreign keys being used in Drupal core for a number of reasons, not least of which foreign keys aid a newcomer when trying to understand the schema installed by Drupal.
As I mentioned at the beginning of this post, any comments or corrections are very much welcome. I hope this information can prove useful to someone else besides me!
blog comments powered by Disqus