Library data

Data transformation details

  • We take MARCXML, select particular sub-fields, and turn each record into a row in a CSV file, using the field property as the column header.
  • We then pass the CSV through a transform that combines sub-fields and normalizes identifiers that are not already URIs.
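The first step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `WANTED` column list, the `050a`-style header format, and the use of `<br>` to join repeated sub-fields are all assumptions, and the sketch expects un-namespaced MARCXML like the snippet shown later in this document.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical selection of sub-fields; headers are tag + sub-field code.
WANTED = ["050a", "050b", "245a", "245b", "260c"]

def record_to_row(record, wanted=WANTED):
    """Flatten one <record> element into a dict keyed by headers such as '050a'."""
    row = {key: "" for key in wanted}
    for datafield in record.findall("datafield"):
        tag = datafield.get("tag")
        for subfield in datafield.findall("subfield"):
            key = tag + subfield.get("code")
            if key in row:
                text = (subfield.text or "").strip()
                # Repeated sub-fields are joined with "<br>" (an assumption).
                row[key] = text if not row[key] else row[key] + "<br>" + text
    return row

def records_to_csv(xml_text, wanted=WANTED):
    """Render every <record> in the XML text as one CSV row."""
    root = ET.fromstring(xml_text)
    records = [root] if root.tag == "record" else root.findall(".//record")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=wanted)
    writer.writeheader()
    for record in records:
        writer.writerow(record_to_row(record, wanted))
    return out.getvalue()
```

Sub-fields that are not listed in `WANTED` are ignored, so the CSV only carries the columns the later transform needs.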

Type and classifications

Each record in the MARCXML document is rendered with:

  • A URI derived from MARC field 010, e.g. http://data.okeeffemuseum.org/library/11013
  • A type of ManMadeObject

Identifiers and Names

Using our existing base patterns, we can model identifiers from MARC fields as Identifier nodes that are classified according to their field type. For example, we can model a library call number as:

{
    "id": "http://data.okeeffemuseum.org/library/11013",
    "type": "ManMadeObject",
    "identified_by": [
        {
            "classified_as": [
                {
                    "id": "aat:300311706",
                    "label": "call numbers",
                    "type": "Type"
                }
            ],
            "id": "http://data.okeeffemuseum.org/library/11013/050",
            "type": "Identifier",
            "value": "ND212 .N39"
        }
    ]
}

The scheme relating MARC fields to classifications is:

| MARC field combination | Classification |
| -------- | --------------- |
| 050$a, 050$b | aat:300311706 ("call numbers") |
| 245$a, 245$b | aat:300404670 ("preferred terms") |
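As an illustration of how this scheme might be applied, the sketch below builds identifier nodes from a CSV row. The names `SCHEME` and `identifier_nodes` are hypothetical, and joining the sub-fields with a single space and giving the Name node a URI in the same pattern as the call number are assumptions rather than documented behaviour.

```python
BASE = "http://data.okeeffemuseum.org/library/"

# MARC tag -> (CSV columns to combine, node type, AAT id, AAT label)
SCHEME = {
    "050": (["050a", "050b"], "Identifier", "aat:300311706", "call numbers"),
    "245": (["245a", "245b"], "Name", "aat:300404670", "preferred terms"),
}

def identifier_nodes(record_id, row):
    """Build the identified_by list for one record from its CSV row."""
    nodes = []
    for tag, (columns, node_type, aat_id, label) in SCHEME.items():
        parts = [row.get(column, "").strip() for column in columns]
        value = " ".join(part for part in parts if part)
        if not value:
            continue  # skip field combinations the record does not carry
        nodes.append({
            "classified_as": [{"id": aat_id, "label": label, "type": "Type"}],
            "id": BASE + record_id + "/" + tag,
            "type": node_type,
            "value": value,
        })
    return nodes
```

For a row holding `050a` = `ND212` and `050b` = `.N39`, this yields a single Identifier node with the value `ND212 .N39`.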

Dates

MARC field 260$c contains a human-readable string giving the date of publication. We render it as the label of the TimeSpan of the object's production, and also interpret the string to derive machine-processable dates:

    {
        "id": "http://data.okeeffemuseum.org/library/11013",
        "type": "ManMadeObject",
        "produced_by": {
            "type": "Production",
            "timespan": {
                "begin_of_the_begin": "1973-01-01T00:00:00",
                "end_of_the_end": "1973-01-01T00:00:00",
                "id": "http://data.okeeffemuseum.org/library/426/production/timespan",
                "label": "c1973",
                "type": "TimeSpan"
            }
        }
    }
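One way the date interpretation could work is sketched below. It assumes that the first four-digit number in the string is the publication year and that the label is the string with any trailing period removed; `timespan_from_260c` is a hypothetical name. Both boundaries are set to 1 January of that year to mirror the example above.

```python
import re

BASE = "http://data.okeeffemuseum.org/library/"

def timespan_from_260c(record_id, date_string):
    """Turn a 260$c string such as 'c1973' or '1950.' into a TimeSpan node."""
    match = re.search(r"\d{4}", date_string)
    if match is None:
        return None  # no recognisable year, so no machine-processable dates
    year = match.group(0)
    return {
        # Both boundaries mirror the example above (1 January of the year).
        "begin_of_the_begin": year + "-01-01T00:00:00",
        "end_of_the_end": year + "-01-01T00:00:00",
        "id": BASE + record_id + "/production/timespan",
        "label": date_string.strip().rstrip("."),
        "type": "TimeSpan",
    }
```

Strings without a four-digit year return `None`, leaving it to the caller to decide whether to emit a label-only TimeSpan.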

Makers and publishers

MARC fields 100$a and 260$b usually carry author names and publisher names respectively, so these are rendered using the techniques pattern:

{
    "id": "http://data.okeeffemuseum.org/library/426/production",
    "type": "ManMadeObject",
    "produced_by": {
        "consists_of": [
            {
                "carried_out_by": [
                    {
                        "id": "http://data.okeeffemuseum.org/person/1325",
                        "identified_by": [
                            {
                                "classified_as": [
                                    {
                                        "id": "aat:300404670",
                                        "label": "preferred terms",
                                        "type": "Type"
                                    }
                                ],
                                "id": "http://data.okeeffemuseum.org/person/1325/name/0",
                                "type": "Name",
                                "value": "Museum of Modern Art"
                            }
                        ],
                        "label": "Museum of Modern Art",
                        "type": "Actor",
                        "exact_match": [
                            "ulan:500303609",
                            "http://id.loc.gov/authorities/names/n50056582"
                        ]
                    }
                ],
                "technique": [
                    {
                        "id": "relators:pbl",
                        "label": "Publisher",
                        "type": "Type"
                    }
                ],
                "type": "Production"
            },
            {
                "carried_out_by": [
                    {
                        "id": "http://data.okeeffemuseum.org/person/szarkowski-john-",
                        "identified_by": [
                            {
                                "classified_as": [
                                    {
                                        "id": "aat:300404670",
                                        "label": "preferred terms",
                                        "type": "Type"
                                    }
                                ],
                                "id": "http://data.okeeffemuseum.org/person/szarkowski-john-/name",
                                "type": "Name",
                                "value": "Szarkowski, John."
                            }
                        ],
                        "type": "Actor"
                    }
                ],
                "technique": [
                    {
                        "id": "relators:aut",
                        "label": "Author",
                        "type": "Type"
                    }
                ],
                "type": "Production"
            }
        ]
    }
}
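The sketch below shows one way the production parts could be assembled from a CSV row. It is illustrative only: `slugify`, `production_part`, and `production_parts` are hypothetical names, and the slug rule is inferred from the `szarkowski-john-` URI in the example above. Actors that already have a numeric ID, such as `person/1325`, would need a lookup that is not shown here.

```python
import re

PERSON_BASE = "http://data.okeeffemuseum.org/person/"

def slugify(name):
    """Lower-case the name and replace each run of other characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower())

def production_part(name, relator_id, relator_label):
    """Build one Production part carried out by a named actor."""
    actor_uri = PERSON_BASE + slugify(name)
    preferred = {"id": "aat:300404670", "label": "preferred terms", "type": "Type"}
    return {
        "carried_out_by": [{
            "id": actor_uri,
            "identified_by": [{
                "classified_as": [preferred],
                "id": actor_uri + "/name",
                "type": "Name",
                "value": name,
            }],
            "type": "Actor",
        }],
        "technique": [{"id": relator_id, "label": relator_label, "type": "Type"}],
        "type": "Production",
    }

def production_parts(row):
    """Map 260$b to a publisher part and 100$a to an author part."""
    parts = []
    if row.get("260b"):
        parts.append(production_part(row["260b"], "relators:pbl", "Publisher"))
    if row.get("100a"):
        parts.append(production_part(row["100a"], "relators:aut", "Author"))
    return parts
```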

Subject headings

MARCXML subject headings are expressed in the fields 655$a, 655$0, and 655$2. We render these as the about property of the object:

{
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "http://data.okeeffemuseum.org/library/11013",
    "type": "ManMadeObject",
    "about": [
        {
            "label": "Catalogs",
            "type": "Type"
        }
    ]
}
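A minimal sketch of this mapping follows. It assumes that only 655$a contributes to the output, as in the example above, and that trailing periods are stripped from the heading; `about_nodes` is a hypothetical name.

```python
def about_nodes(headings):
    """Turn 655$a heading strings into Type nodes for the about property."""
    nodes = []
    for heading in headings:
        label = heading.strip().rstrip(".")
        if label:
            nodes.append({"label": label, "type": "Type"})
    return nodes
```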

Descriptive Cataloging

MARC uses fields 300$a, 300$b, and 300$c to express aspects of the book's format, while 500$3, 500$5, and 500$a contain repository-specific notes (in the case of the O'Keeffe collection, this is often the location where ephemera were found), and 590$a contains condition information. These become nodes with the type LinguisticObject:

{
    "id": "http://data.okeeffemuseum.org/library/11013",
    "referred_to_by": [
        {
            "classified_as": [
                {
                    "id": "aat:300266038",
                    "label": "format",
                    "type": "Type"
                }
            ],
            "id": "http://data.okeeffemuseum.org/library/426/300abc",
            "type": "LinguisticObject",
            "value": "215 p. : ill. ; 28 cm."
        },
        {
            "classified_as": [
                {
                    "id": "aat:300028702",
                    "label": "inscriptions",
                    "type": "Type"
                }
            ],
            "id": "http://data.okeeffemuseum.org/library/426/590a",
            "type": "LinguisticObject",
            "value": "abq copy : good condition ; dust jacket : fair condition."
        },
        {
            "classified_as": [
                {
                    "id": "aat:300411780",
                    "label": "descriptions (documents)",
                    "type": "Type"
                }
            ],
            "id": "http://data.okeeffemuseum.org/library/426/500a",
            "type": "LinguisticObject",
            "value": "Ephemera found in front of t.p. : compliments card from The Department of Photography, The Museum of Modern Art, New York.<br>Georgia O'Keeffe Personal Library.<br>Page marker found between pp. 74-[75]."
        }
    ]
}

Descriptive cataloging is classified with this scheme:

| MARC field combination | Classification |
| -------- | --------------- |
| 300$a, 300$b, 300$c | aat:300266038 ("format") |
| 590$a | aat:300028702 ("inscriptions") |
| 500$3, 500$5, 500$a | aat:300411780 ("descriptions") |
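The combination of sub-fields into descriptive notes can be sketched as follows. The joining rules are inferred from the examples in this section (a space between the 300 sub-fields, `<br>` between repeated 500$a notes) and are assumptions, as are the names `NOTE_SCHEME` and `descriptive_nodes`.

```python
BASE = "http://data.okeeffemuseum.org/library/"

# URI suffix -> AAT classification, following the scheme above
NOTE_SCHEME = [
    ("300abc", "aat:300266038", "format"),
    ("590a", "aat:300028702", "inscriptions"),
    ("500a", "aat:300411780", "descriptions (documents)"),
]

def descriptive_nodes(record_id, format_parts, notes, condition=""):
    """Build the referred_to_by list from 300$a-c, 500$a, and 590$a values."""
    texts = {
        "300abc": " ".join(part for part in format_parts if part),
        "590a": condition,
        "500a": "<br>".join(note for note in notes if note),
    }
    nodes = []
    for suffix, aat_id, label in NOTE_SCHEME:
        if texts[suffix]:
            nodes.append({
                "classified_as": [{"id": aat_id, "label": label, "type": "Type"}],
                "id": BASE + record_id + "/" + suffix,
                "type": "LinguisticObject",
                "value": texts[suffix],
            })
    return nodes
```

Field groups with no content are skipped, so a record without a 590$a simply has no "inscriptions" node.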

Example document

The full scope of the library transformation becomes clear when looking at a MARCXML snippet:

<record>
  <controlfield tag="001">11013</controlfield>
  <datafield tag="050" ind1=" " ind2=" ">
    <subfield code="a">ND212</subfield>
    <subfield code="b">.N39</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">100 American painters of the 20th century;</subfield>
    <subfield code="b">works selected from the collections of the Metropolitan Museum of Art.</subfield>
    <subfield code="c">With an introd. by Robert Beverly Hale.</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="a">New York,</subfield>
    <subfield code="c">1950.</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
    <subfield code="a">xxiii, 111 pages</subfield>
    <subfield code="b">plates (some color)</subfield>
    <subfield code="c">26 cm.</subfield>
  </datafield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="3">SColl abq</subfield>
    <subfield code="a">Ephemera found in front of Introduction p. : booklet : American Painters of the 20th Century : represented in the collections of the Metropolitan Museum of Art.</subfield>
    <subfield code="5">NmSfGOM</subfield>
  </datafield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="3">SColl abq</subfield>
    <subfield code="a">Ephemera found in front of t.p. : compliments card from the Metropolitan Museum of Art ; booklet : 20th Century Painters: a special exhibition of oils, water colors and drawings selected from the collections of American Art in the Metropolitan Museum, June 16, 1950.</subfield>
    <subfield code="5">NmSfGOM</subfield>
  </datafield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="3">SColl abq</subfield>
    <subfield code="a">Georgia O'Keeffe Personal Library.</subfield>
    <subfield code="5">NmSfGOM</subfield>
  </datafield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="3">SColl abq</subfield>
    <subfield code="a">Page marker found between pp. 58-59.</subfield>
    <subfield code="5">NmSfGOM</subfield>
  </datafield>
  <datafield tag="655" ind1=" " ind2="7">
    <subfield code="a">Catalogs.</subfield>
    <subfield code="2">fast</subfield>
    <subfield code="0">(OCoLC)fst01423692</subfield>
  </datafield>
</record>

And its transformed JSON-LD:

{
    "@context": "https://linked.art/ns/v1/linked-art.json",
    "id": "http://data.okeeffemuseum.org/library/11013",
    "type": "ManMadeObject",
    "about": [
        {
            "label": "Catalogs",
            "type": "Type"
        }
    ],
    "classified_as": [
        {
            "id": "aat:300028051",
            "label": "books",
            "type": "Type"
        }
    ],
    "identified_by": [
        {
            "classified_as": [
                {
                    "id": "aat:300404670",
                    "label": "preferred terms",
                    "type": "Type"
                }
            ],
            "type": "Name",
            "value": "100 American painters of the 20th century;works selected from the collections of the Metropolitan Museum of Art."
        },
        {
            "classified_as": [
                {
                    "id": "aat:300311706",
                    "label": "call numbers",
                    "type": "Type"
                }
            ],
            "id": "http://data.okeeffemuseum.org/library/11013/050",
            "type": "Identifier",
            "value": "ND212 .N39"
        }
    ],
    "produced_by": {
        "carried_out_by": [],
        "consists_of": [
            {
                "carried_out_by": [],
                "classified_as": [],
                "id": "http://data.okeeffemuseum.org/library/11013/production/publishing",
                "technique": [
                    {
                        "id": "relators:pbl",
                        "label": "Publisher",
                        "type": "Type"
                    }
                ],
                "type": "Production"
            }
        ],
        "id": "http://data.okeeffemuseum.org/library/11013/production",
        "timespan": {
            "begin_of_the_begin": "1950-01-01T00:00:00",
            "end_of_the_end": "1950-01-01T00:00:00",
            "id": "http://data.okeeffemuseum.org/library/11013/production/timespan",
            "label": "1950",
            "type": "TimeSpan"
        },
        "type": "Production"
    },
    "referred_to_by": [
        {
            "classified_as": [
                {
                    "id": "aat:300411780",
                    "label": "descriptions (documents)",
                    "type": "Type"
                }
            ],
            "id": "http://data.okeeffemuseum.org/library/11013/500a",
            "type": "LinguisticObject",
            "value": "Ephemera found in front of Introduction p. : booklet : American Painters of the 20th Century : represented in the collections of the Metropolitan Museum of Art.<br>Ephemera found in front of t.p. : compliments card from the Metropolitan Museum of Art ; booklet : 20th Century Painters: a special exhibition of oils, water colors and drawings selected from the collections of American Art in the Metropolitan Museum, June 16, 1950.<br>Georgia O'Keeffe Personal Library.<br>Page marker found between pp. 58-59."
        },
        {
            "classified_as": [
                {
                    "id": "aat:300028702",
                    "label": "inscriptions",
                    "type": "Type"
                }
            ],
            "id": "http://data.okeeffemuseum.org/library/11013/590a",
            "type": "LinguisticObject",
            "value": "abq copy : fair condition."
        },
        {
            "classified_as": [
                {
                    "id": "aat:300266038",
                    "label": "format",
                    "type": "Type"
                }
            ],
            "id": "http://data.okeeffemuseum.org/library/11013/300abc",
            "type": "LinguisticObject",
            "value": "xxiii, 111 pages plates (some color) 26 cm."
        }
    ]
}