This page is concerned with tag structures. More information on working with tags can be found on the Using the Tags Panel page.
Some more complex structures in PDF are composed of multiple tags. Lists, tables, and tables of contents all need particular attention paid to their tags to ensure that their structure is correctly conveyed. Each of these objects uses container tags that define the structure of the element and hold the tags that carry the content. If a container tag is missing, you can add it by right clicking the tag above where you'd like to insert it, choosing New Tag, and selecting the type of tag you'd like to insert.
Lists are the most straightforward of these structures. Their structure consists of a root list tag <L> that contains list items <LI>. List items contain labels <Lbl> (the bullet, number, or letter) and the list item body <LBody> that contains the main content. Indentation levels can be structured as an <L> nested inside a <LBody>. Example structures are shown below:
The label is an important piece of content that should be preserved if it is present, as it provides the context of what type of list is being used. In some situations, a list may be implied, but a label may not be present. In this case you can omit the label tag. In most cases, however, the structure should mirror the above. Acrobat will generally throw an error if tags are out of place, but it may not check to see if the body is in a single tag, if the label is separated, or if the list is split across multiple list tags (especially if it spans multiple pages). For these reasons, it's good practice to always check over your lists' tag structure regardless.
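The checks described above can be sketched in code. This is a minimal illustration only: it assumes a hypothetical in-memory model where each tag is a `(name, children)` tuple, and `check_list` is an invented helper, not part of Acrobat or any PDF library.

```python
# Hypothetical model of a tag tree: each tag is (name, [children]).
# Illustration only; this is not an Acrobat or PDF library API.

ALLOWED_CHILDREN = {
    "L": {"LI"},            # a list holds list items
    "LI": {"Lbl", "LBody"},  # a list item holds a label and a body
}


def check_list(tag, path="L"):
    """Return a list of human-readable problems found in a list tag tree."""
    name, children = tag
    problems = []

    allowed = ALLOWED_CHILDREN.get(name)
    if allowed is not None:
        for child_name, _ in children:
            if child_name not in allowed:
                problems.append(f"{path}: unexpected <{child_name}> inside <{name}>")

    if name == "LI":
        bodies = [c for c in children if c[0] == "LBody"]
        labels = [c for c in children if c[0] == "Lbl"]
        if len(bodies) != 1:
            problems.append(f"{path}: expected exactly one <LBody>, found {len(bodies)}")
        if len(labels) > 1:
            problems.append(f"{path}: expected at most one <Lbl>, found {len(labels)}")

    # Recurse so that an <L> nested inside an <LBody> is checked as well.
    for index, child in enumerate(children):
        problems += check_list(child, f"{path}/{child[0]}[{index}]")
    return problems


good = ("L", [
    ("LI", [("Lbl", []), ("LBody", [
        ("L", [("LI", [("Lbl", []), ("LBody", [])])]),  # nested level
    ])]),
    ("LI", [("LBody", [])]),  # label omitted because none is visible
])
bad = ("L", [
    ("LI", [("Lbl", []), ("LBody", []), ("LBody", [])]),  # body split in two
    ("P", []),                                            # stray paragraph
])

print(check_list(good))
for problem in check_list(bad):
    print(problem)
```

The second tree reproduces two of the problems mentioned above: a list item body split across two tags and a tag that does not belong directly inside the list.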
Tables in PDFs, particularly in academic contexts, tend to be more complex than those in Word. Luckily, PDF provides the tools to describe these complex structures. Regularity, meaning that every row accounts for the same number of columns and every column for the same number of rows, is still a major concern when creating an accurate and accessible tagged representation of a table, but merged cells can be adjusted to fit within that regularity.
Similar to lists, tables require a prescribed structure to be read properly, consisting of a root table tag <Table> that contains table rows <TR>, which contain table data <TD> or table header <TH> cells. A sample structure of a regular 3x3 table is given below:
Your first goal in addressing a table in PDF is to ensure that the tag structure matches this paradigm. Check that each row contains exactly the cells that belong to it and no additional ones. If there are empty cells, be sure to add tags for them, as their structural presence is important. Also, tags should be present in the order in which they appear in the table, so as you go down the tags within a <TR>, the highlighted cells should move from left to right.
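As a rough sketch of the structure just described, the snippet below prints the tag tree for a regular 3x3 table. The cell contents and the `table_tags` helper are hypothetical, and the choice of one header row plus one header column is an assumption made for illustration.

```python
def table_tags(grid, header_rows=1, header_cols=1):
    """Render a grid of cell texts as an indented <Table>/<TR>/<TH>/<TD> tree."""
    lines = ["<Table>"]
    for r, row in enumerate(grid):
        lines.append("  <TR>")
        # Cells are emitted left to right, matching their visual order.
        for c, text in enumerate(row):
            tag = "TH" if r < header_rows or c < header_cols else "TD"
            lines.append(f"    <{tag}> {text}".rstrip())
    return lines


# Hypothetical 3x3 table; the empty top-left cell still gets its own tag.
grid = [
    ["",      "Col A", "Col B"],
    ["Row 1", "a1",    "b1"],
    ["Row 2", "a2",    "b2"],
]

lines = table_tags(grid)
print("\n".join(lines))
```

Every row produces exactly three cell tags, including the empty corner cell, which is what keeps the structure regular.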
Note here that both the columns and rows have headers. The scope for each header cell can be set with the Table Editor to label columns, rows, or both, depending on the structure of the table in question.
By default, the header and data cell tags are assumed to fit within a single row and a single column. If you have a table with a more complex structure, you may have fewer cells in one row or column than in another, which will throw a regularity error. To address this, you can change the column or row span with the ColSpan and RowSpan attributes, either by editing them directly as described in the next section, or by editing the cell with the Table Editor.
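To see why span attributes resolve a regularity error, the following sketch counts how many grid columns each row occupies once ColSpan and RowSpan are taken into account. It is a simplified approximation of the idea, not Acrobat's actual check, and the dictionary-based cell model and the helper names are assumptions.

```python
def cell(tag, **attrs):
    """Build a hypothetical cell record, e.g. cell("TH", ColSpan=3)."""
    return {"tag": tag, **attrs}


def row_widths(rows):
    """Return the number of grid columns each row occupies, counting spans."""
    widths = [0] * len(rows)
    for r, row in enumerate(rows):
        for c in row:
            col_span = c.get("ColSpan", 1)  # default: a single column
            row_span = c.get("RowSpan", 1)  # default: a single row
            # A cell spanning several rows also takes up space in the rows below.
            for covered in range(r, min(r + row_span, len(rows))):
                widths[covered] += col_span
    return widths


def is_regular(rows):
    """True when every row accounts for the same number of columns."""
    return len(set(row_widths(rows))) <= 1


merged_header = [
    [cell("TH", ColSpan=3)],
    [cell("TH"), cell("TH"), cell("TH")],
    [cell("TD"), cell("TD"), cell("TD")],
]
missing_span = [
    [cell("TH")],  # merged cell tagged without a ColSpan
    [cell("TH"), cell("TH"), cell("TH")],
    [cell("TD"), cell("TD"), cell("TD")],
]

print(row_widths(merged_header), is_regular(merged_header))
print(row_widths(missing_span), is_regular(missing_span))
```

The first table is regular because the single header cell is counted as three columns; the second has one cell in its first row and three in the others, which is the situation that triggers the error.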
You can open the Table Editor from the right click menu of the root table tag. This will display an overlay on the document showing the tag structure of the table. Here, you can select cells by clicking on them individually or dragging a rectangle with the mouse, and if you right click on them you can edit their properties or the display settings of the Table Editor.
If your Table Editor display looks broken, it could be either a structural issue in the table or just some problematic background formatting. Luckily, you can still make edits to cells' properties by modifying their attributes directly (see below).
To correct a merged cell, select it in the editor and open its properties. Here you can input how many rows or columns the cell spans. It will count from its position in the tag structure onward, so if you have a table like the one below, the leftmost header cell should be put in the first row and given a ColSpan of 3.
Table Correction Example
If you're running into issues with the Table Editor preventing you from selecting the correct cells to edit their span, you can set the span by editing the cell's attributes directly.
1. Right click the tag in question and open its properties.
2. Click on the Edit Attribute Objects… button.
3. Open any objects that are there and see if any have an /O /Table item. If so, skip to step 7.
4. With the highest level object selected (if there is one), click New Item.
5. Open the new Attribute Object, select the /O /Layout item, and click Change Item.
6. Type "Table". Capitalization matters.
7. Add the relevant span attribute:
   - If you already had a Table object and the relevant attribute is already present, select it and click Change Item. Then enter the correct value.
   - Otherwise, select the Object that contains your /O /Table item, click New Item, and enter the relevant information. Use RowSpan or ColSpan (capitalization matters) for your Key; Value is the number of rows or columns it should span, and Value Type should be Integer.
8. Click OK out of any open dialog boxes and close the properties. If you right click on your table and reopen the Table Editor, you should see the editor reflect your changes.
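The result of these steps is an attribute object whose owner is Table and which carries an integer span value. The sketch below writes such an object out in PDF dictionary notation to show what the steps are building. The `table_attribute_object` helper is hypothetical, and how Acrobat actually stores the object in a given file (for example alongside other attribute objects) may differ.

```python
def table_attribute_object(col_span=None, row_span=None):
    """Format a Table-owned attribute object in PDF dictionary notation.

    Hypothetical helper for illustration; it only builds a string.
    """
    entries = [("O", "/Table")]  # the owner must be spelled exactly "Table"
    for key, value in (("ColSpan", col_span), ("RowSpan", row_span)):
        if value is None:
            continue
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer, got {value!r}")
        entries.append((key, value))
    body = " ".join(f"/{key} {value}" for key, value in entries)
    return f"<< {body} >>"


print(table_attribute_object(col_span=3))
print(table_attribute_object(col_span=2, row_span=2))
```

The key names are case sensitive in the same way as in the dialog: `ColSpan` and `RowSpan`, with `Table` as the owner.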
If your changes are not reflected (this normally happens when there's strange backend formatting, in which case the Table Editor display is generally unusable), just make all of the necessary adjustments to the table elements so that the tag structure matches the visual structure. If you've done this correctly, running Check Again in the Accessibility Checker will no longer report errors for your table.
Table of Contents
<TOC> (Table of Contents) and <TOCI> (Table of Contents Item) tags can be used similarly to the list structure to provide the structure for a table of contents, with a <TOC> tag containing <TOCI> tags. The contents of a <TOCI> tag will generally be <Reference> tags for the description and page number, <Lbl> tags if the entries are numbered, and <NonStruct> tags to contain the leader lines, though these can also be backgrounded. Indented portions can be nested as with lists. An example structure is provided below:
1. Lists 1
2. Tables 2
   a. Table Editor 3
   b. Edit Attributes 6
3. Table of Contents 11
4. Article Citation 23
Full Tag Structure
<Reference>Table of Contents
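One plausible tagging of the example table of contents above can be sketched as follows. This is an illustration under assumptions: the `toc_tags` helper is hypothetical, the leader lines are assumed to be backgrounded (so no <NonStruct> tags appear), and the nested <TOC> is placed inside its parent <TOCI>, mirroring how nested lists sit inside a list item.

```python
def toc_tags(entries, depth=0):
    """Render (label, title, page, children) entries as an indented tag tree."""
    pad = "  " * depth
    lines = [f"{pad}<TOC>"]
    for label, title, page, children in entries:
        lines.append(f"{pad}  <TOCI>")
        lines.append(f"{pad}    <Lbl> {label}")
        lines.append(f"{pad}    <Reference> {title} {page}")
        if children:
            # Indented entries become a nested <TOC> inside the parent <TOCI>.
            lines.extend(toc_tags(children, depth + 2))
    return lines


entries = [
    ("1.", "Lists", 1, []),
    ("2.", "Tables", 2, [
        ("a.", "Table Editor", 3, []),
        ("b.", "Edit Attributes", 6, []),
    ]),
    ("3.", "Table of Contents", 11, []),
    ("4.", "Article Citation", 23, []),
]

toc_lines = toc_tags(entries)
print("\n".join(toc_lines))
```

If the leader lines are kept in the tag tree rather than backgrounded, each <TOCI> would also hold a <NonStruct> tag for them.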
The following articles were used as examples. Modifications to the PDF structure were made for illustrative purposes.
Obeka, C., & Numbere, A. O. (2020). Heavy metal concentration and public health risk in consuming Sardinella maderensis (Sardine), Sarotherodon melanotheron (Tilapia), and Liza falicipinis (Mullet) harvested from Bonny River, Nigeria. Journal of Oceanography and Marine Science, 11(1), 1-10. https://doi.org/10.5897/JOMS2019.0158
Copyright © 2022 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0.