cdli-gh / data
This is a copy of the daily dump of catalogue and ATF data from the Cuneiform Digital Library Initiative (http://cdli.ucla.edu)
Home Page: http://cdli.ucla.edu/bulk_data
In cdliatf_unblocked.atf:
1 | BRM 3, 031 | @columnn 1
2 | BRM 3, 031 | @columnn 2
3 | BRM 3, 050 | @columnn 1
4 | BRM 3, 050 | @columnn 2
5 | CST 696 | @columnn 1
6 | CST 696 | @columnn 2
7 | PDT 1, 0377 | @columnn 1
8 | PDT 1, 0377 | @columnn 2
9 | PDT 1, 0388 | @columnn 1
10 | PDT 1, 0388 | @columnn 2
11 | PDT 1, 0396 | @columnn 1
12 | PDT 1, 0396 | @columnn 2
13 | PDT 1, 0398 | @columnn 1
14 | PDT 1, 0398 | @columnn 2
15 | PDT 1, 0482 | @columnn 1
16 | PDT 1, 0483 | @columnn 1
17 | PDT 1, 0483 | @columnn 2
18 | PDT 1, 0488 | @columnn 1
19 | PDT 1, 0488 | @columnn 2
20 | PDT 1, 0498 | @columnn 1
21 | PDT 1, 0498 | @columnn 2
22 | PDT 1, 0522 | @columnn 1
23 | PDT 1, 0522 | @columnn 2
24 | PDT 1, 0528 | @columnn 1
25 | PDT 1, 0528 | @columnn 2
26 | PDT 1, 0538 | @columnn 1
27 | PDT 1, 0538 | @columnn 2
28 | PDT 1, 0569 | @columnn 1
29 | PDT 1, 0569 | @columnn 2
30 | PDT 1, 0586 | @columnn 1
31 | PDT 1, 0586 | @columnn 2
32 | PDT 1, 0587 | @columnn 1
33 | PDT 1, 0587 | @columnn 2
34 | PDT 1, 0609 | @columnn 1
35 | PDT 1, 0609 | @columnn 2
36 | PDT 1, 0610 | @columnn 1
37 | PDT 1, 0610 | @columnn 2
38 | PDT 1, 0682 | @columnn 1
39 | PDT 1, 0682 | @columnn 2
40 | SAT 3, 1359 | @columnn 1
41 | SAT 3, 1359 | @columnn 2
42 | CBS 09275 | @columnn 1
43 | RIME 1.14.20.01, ex. 63 | @columnn 1
44 | RIME 2.13.01.01b | @columnn 1
45 | RIME 2.13.01.01b | @columnn 2
46 | RIME 4.01.05.04, ex. add120 | @columnn 1
47 | RIME 4.01.05.04, ex. add120 | @columnn 2
48 | RINAP 3/1 Sennacherib 24 composite | @columnn 1
49 | RIME 3/1.01.07.041, ex. add403 | @columnn 1
50 | ARTA 2015/003 | @columnn 1
51 | ARTA 2015/003 | @columnn 2
52 | CTMMA 1, 002 | @columnn 1
53 | CTMMA 1, 002 | @columnn 2
54 | TSÅ 0936 | @columnn 1
55 | TSÅ 0936 | @columnn 2
56 | MARI 05, p. 071, 104-105 no. 06 | @columnn 1
57 | MARI 05, p. 071, 104-105 no. 06 | @columnn 2
By the way, thanks for the amazing work!
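A scan like the one behind the list above can be sketched in a few lines of Python. This is only an illustration: the function name find_unknown_labels and the KNOWN_LABELS set are assumptions made for the example, and the set covers just the labels quoted in these reports, not the full ATF vocabulary.

```python
import re

# Labels quoted in the reports on this page; NOT a complete list of ATF labels.
KNOWN_LABELS = {"tablet", "obverse", "reverse", "column", "object", "surface", "law"}


def find_unknown_labels(atf_text):
    """Return (line_number, label) pairs for @-lines whose label is not in KNOWN_LABELS."""
    hits = []
    for number, line in enumerate(atf_text.splitlines(), start=1):
        match = re.match(r"@(\w+)", line)
        if match and match.group(1) not in KNOWN_LABELS:
            hits.append((number, match.group(1)))
    return hits


sample = (
    "@object cone\n"
    "@surface a\n"
    "@columnn 1\n"
    "1. {d}nin-gir2-su\n"
    "@columnn 2\n"
    "1. mu-na-du3\n"
)
print(find_unknown_labels(sample))  # [(3, 'columnn'), (5, 'columnn')]
```

The same check would also flag other misspelled labels, as long as the correct spelling is in the set.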
In P218312 at the start of reverse column 2, there's what looks like a spurious language directive:
@column 2
# atf lang a
1. _in_
>>Q000008 colophon
In P498859 column 5 lines 29 and 33 some of the translation lines are missing a colon (':').
29. ki-iz-za-ta u3 ni-{szi}szir3-ta5
#tr.ts: kizzata u niširta
#tr.en curtailment and deduction
[...]
33. _a-sza3_ ad-di-na-asz2-szu a-na _nam_ ut-ter
#tr.ts eqel addinaššu ana pīhāti uttēr
#tr.en the field that I gave he returns to the province,
In P203171 obverse line 5 and reverse line 1, the second empty translation lines should probably be removed.
5. ki-es3-sa2{ki#}
#tr.en: Ki’eša,
#tr.en:
@reverse
1. e2 dingir-re-ne#
#tr.en: houses of the gods,
#tr.en:
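Both kinds of problem reported above (a translation directive without its colon, and a translation line with no text) can be caught mechanically. The following Python sketch assumes that a well-formed directive looks like #tr.<code>: followed by text, which is inferred from the examples in these reports rather than from a formal grammar. The function name and the problem labels are made up for the example.

```python
import re

# Shape inferred from the reports: "#tr." + language code + ":" + text.
WELL_FORMED = re.compile(r"#tr\.[a-z]+:(\s|$)")


def check_translation_lines(atf_text):
    """Return (line_number, problem) pairs for suspicious #tr lines."""
    problems = []
    for number, line in enumerate(atf_text.splitlines(), start=1):
        if not line.startswith("#tr"):
            continue
        if not WELL_FORMED.match(line):
            problems.append((number, "malformed directive"))
        elif line.split(":", 1)[1].strip() == "":
            problems.append((number, "empty translation"))
    return problems


sample = (
    "29. ki-iz-za-ta u3 ni-{szi}szir3-ta5\n"
    "#tr.ts: kizzata u niširta\n"
    "#tr.en curtailment and deduction\n"
    "5. ki-es3-sa2{ki#}\n"
    "#tr.en: Ki’eša,\n"
    "#tr.en:\n"
)
print(check_translation_lines(sample))
# [(3, 'malformed directive'), (6, 'empty translation')]
```

The pattern does not validate the language code itself, so a wrong but well-shaped code would pass; a list of allowed codes would be needed for that.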
In P497998 law 24 line 269, and law 42 line 553, the English translation line is missing a colon after the #tr.en directive.
269. tal-ta-du-du-u2-ni _dam_-su
#tr.ts: taltaduduni aššassu
#tr.en had drawn away, his wife
>>QMAL 269
553. lu-u2 i-na sza-ku-ul-te
#tr.ts: lū ina šākulte
#tr.en whether at a banquet
>>QMAL 553
In P497998 law 23, line 241, the translation marker is missing the language designation: #tr. should be #tr.en:
241. u2-usz-szu-ru-szu-nu
#tr.ts: uššurūšunu
#tr. they shall release them;
>>QMAL 241
law 24, line 269 is missing a colon (':') on the translation #tr directive.
269. tal-ta-du-du-u2-ni _dam_-su
#tr.ts: taltaduduni aššassu
#tr.en had drawn away, his wife
>>QMAL 269
law 42, line 553 is also missing a colon (':') on the translation #tr directive.
553. lu-u2 i-na sza-ku-ul-te
#tr.ts: lū ina šākulte
#tr.en whether at a banquet
>>QMAL 553
P273040 is missing a colon after the #tr.en directive on line 9.
9. isz-tu _u4 5(disz)-kam_
#tr.ts: ištu ūmim ḫamišat
#tr.en since five days
In P448526 on line 10, the translation directive is missing a colon: #tr.en should be #tr.en:
9. gesztu2 nig2 mah-a
#tr.en: and tremendous intelligence
10. mu-na-ni-in-szum2-ma
#tr.en he (Enki) had given to him regarding it,
P382269 is missing a newline between the description and the #atf: lang directive.
&P382269 = TCBI 1, 017 #atf: lang sux
@tablet
@obverse
In P499175 the language declaration is missing a colon after the #atf directive and lang is capitalized.
#atf Lang sux
should be
#atf: lang sux
P215684 is missing a colon (':') on the language directive: #atf lang should be #atf: lang
P215684 = MVN 03, 027
#atf lang akk
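The language-directive reports above all describe deviations from one shape, #atf: lang followed by a language code on a line of its own. A hedged Python sketch of a check for that shape follows. The function name is hypothetical, and the pattern is inferred only from the corrections suggested in these reports.

```python
import re

# Expected shape, inferred from the corrections suggested in the reports.
WELL_FORMED = re.compile(r"#atf: lang \S+")
# Anything that looks like an attempt at an #atf directive.
CANDIDATE = re.compile(r"#\s*atf", re.IGNORECASE)


def find_bad_lang_directives(atf_text):
    """Return the numbers of lines that try to declare a language but are malformed."""
    bad = []
    for number, line in enumerate(atf_text.splitlines(), start=1):
        stripped = line.strip()
        looks_like_lang = CANDIDATE.search(stripped) and "lang" in stripped.lower()
        if looks_like_lang and not WELL_FORMED.fullmatch(stripped):
            bad.append(number)
    return bad


sample = (
    "&P382269 = TCBI 1, 017 #atf: lang sux\n"
    "@tablet\n"
    "#atf Lang sux\n"
    "#atf: lang sux\n"
    "#atf lang akk\n"
)
print(find_bad_lang_directives(sample))  # [1, 3, 5]
```

Because the whole line must match, a directive fused onto the end of an &-line is reported as well.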
These CDLI numbers have ATF in the data export, but there's no corresponding data in the catalogue csv files. They also don't display entries on the website.
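A cross-check of this kind can be sketched as a set difference between the ids found in the ATF dump and those in the catalogue. In the Python sketch below, the column name id_text, the assumption that the catalogue stores bare numbers without the P prefix, and the sample data are all illustrative guesses, not details taken from the actual export.

```python
import csv
import io
import re


def atf_ids(atf_text):
    """Collect the P-numbers that open each ATF record, e.g. '&P333111 = ...'."""
    return set(re.findall(r"^&(P\d+)", atf_text, flags=re.MULTILINE))


def catalogue_ids(csv_text, id_column="id_text"):
    """Collect ids from the catalogue; the column name is an assumption."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    return {"P" + row[id_column].zfill(6) for row in reader}


def atf_without_catalogue(atf_text, csv_text):
    """Return ids that have ATF but no catalogue row, sorted."""
    return sorted(atf_ids(atf_text) - catalogue_ids(csv_text))


# Made-up miniature inputs, only to show the mechanics.
atf_sample = "&P333111 = AbB 11, 134\n#atf: lang akk\n&P215684 = MVN 03, 027\n#atf: lang akk\n"
csv_sample = 'id_text,designation\n333111,"AbB 11, 134"\n'
print(atf_without_catalogue(atf_sample, csv_sample))  # ['P215684']
```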
The description field of P402035 is just a copy of the #atf: lang directive on the next line. It should probably be AMT pl. 005 04 instead.
&P402035 = #atf: lang akk
#atf: lang akk
@tablet
@obverse
$ beginning broken
1'. [...] _{szim#}buluh# {szim#}li_ x [...]
In P272901, starting with reverse line 7' and going until the end of the tablet, the translation lines are missing a colon (':').
6'. i-na ki-sze2-er-szi2-im wa-asz2-ba-ku-ni _tug2-hi-a_ ta-ta#-[ad]-na-ni
#tr.en: in jail you sold textiles for me.
7'. i-na a-mu-tim u2 _tug2-hi-a_ ta-da-nim a-szur3-ma-lik
#tr.en When the iron and the textiles were sold in Aszszur-malik,
8'. kur-ub-isztar _szesz_-szu a-szur3-i-mi3-ti2 _dumu_ i-ku-pi2-a
#tr.en Kurub-Isztar, his brother Aszszur-imitti, son of Ikkupija,
[...]
In P000001 the seal_id field contains only a ctrl-K character. This doesn't cause problems with the web representation but is confusing in the cdli_catalogue.csv export. The field should just be empty instead.
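Stray control characters such as ctrl-K (vertical tab, \x0b) are invisible in most viewers but easy to find programmatically. The Python sketch below scans every field of a CSV export for them. The function name and the id_text column in the sample are assumptions made for the example; seal_id is the field named in the report.

```python
import csv
import io
import re

# ASCII control characters other than tab, line feed, and carriage return.
CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def find_control_chars(csv_text):
    """Return (data_row_number, column_name) pairs for fields containing control characters."""
    hits = []
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    for row_number, row in enumerate(reader, start=1):
        for field, value in row.items():
            if isinstance(value, str) and CONTROL.search(value):
                hits.append((row_number, field))
    return hits


# Made-up two-row export; the first row mimics the seal_id problem.
sample = "id_text,seal_id\n1,\x0b\n2,\n"
print(find_control_chars(sample))  # [(1, 'seal_id')]
```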
In P498314 column one line 9, the marker for the normalization line is missing a colon: #tr.ts ana... should be #tr.ts: ana...
9. a-na {d}marduk be-li2-szu
#tr.ts ana Marduk bēlišu
#tr.en: to Marduk, his lord,
Can we update the readme with an example showing what the dump looks like? Probably showing the first five entries of the data.
In P497998 law 49 line 709, the second #tr.ts should be #tr.en.
709. [...] sza ki-i [...] _szesz ha-la_
#tr.ts: ... ša kî ... aḫi zitta
#tr.ts: ... who like ... a brother the share
In P491222 canto 3 line 44, the second #tr.ts should be #tr.en.
44. ana {disz}szub-szi-mesz-re-e-{d}szakkan2 u2-bil-la s,i-im-da
#tr.ts: ana šubši-mešrē-šakkan ūbila ṣīmda
#tr.ts: for Šubši-mešrē-Šakkan I brought a bandage.”
In P348900 the language declaration is missing the colon separator.
#atflang = akk
should be
#atf: lang akk
In P333111 the @tblet label should be @tablet.
&P333111 = AbB 11, 134
#atf: lang akk
@tblet
@obverse
1. a-na _{d}suen_-i-ri-ba-am#
2. qi2-bi2-ma
In P112338 reverse column 1 line 1 the empty second translation line should probably be removed.
#tr.en: workdays, at the field “Ninnudu,”
#tr.en:
Every commit is a new copy of the zip files; git doesn't handle binary files very well and so they all get stored in the history separately. If they were stored as text, then presumably git could just store the deltas and the total size of the repo, currently 2.2gb, would grow much more slowly.
P223125 is missing a colon (':') on the language directives: #atf lang should be #atf: lang
&P223125 = TIM 03, 008
#atf lang akk
In P223130 the language declaration has a quote character instead of a colon separator.
#atf" lang akk
should be
#atf: lang akk
In P220931 reverse column 2 line 8 the empty translation line should probably be removed.
8. lugal
#tr.de: des Königs
#tr.de:
In P432241 line 6, the translation directive is missing a colon.
6. a mu-na-ru
#tr.en dedicated (this).
The final line should be #tr.en: dedicated (this).
In P464922 the @columnn labels should be @column.
@object cone
@surface a
@columnn 1
1. {d}nin-gir2-su
[...]
@columnn 2
1. mu-na-du3
These objects have a bare id number on the &-line of their atf representation, without the normal P-prefix, which is required to look up the entries on the website.
&504600 = CDLI Seals 013473 (physical)
[...]
&504601 = CDLI Seals 013474 (physical)
[...]
&504598 = CDLI Seals 013481 (physical)
Those should be &P504600, &P504601, and &P504598, respectively.
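A simple pattern check on &-lines catches this class of error. The Python sketch below assumes that a valid record header is &P plus six digits, then " = " and a designation. That shape is inferred from the records quoted on this page, and the function name is hypothetical.

```python
import re

# Assumed header shape: "&P" + six digits + " = " + designation.
AMP_LINE = re.compile(r"&P\d{6} = \S.*")


def find_bad_amp_lines(atf_text):
    """Return (line_number, line) pairs for &-lines that do not match the assumed shape."""
    return [
        (number, line)
        for number, line in enumerate(atf_text.splitlines(), start=1)
        if line.startswith("&") and not AMP_LINE.fullmatch(line)
    ]


sample = (
    "&504600 = CDLI Seals 013473 (physical)\n"
    "&P504601 = CDLI Seals 013474 (physical)\n"
)
print(find_bad_amp_lines(sample))
# [(1, '&504600 = CDLI Seals 013473 (physical)')]
```

The check only looks at the form of the line, so a header whose designation is wrong but well-shaped would still pass.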
In P323929 there is a typo in the language declaration: #atfz; lang sux should be #atf: lang sux
P125779 is marked as a duplicate/copy of P126262. I'm not sure what the correct syntax for this is, but using a &-line probably isn't it and confuses parsers.
&P125779 = PDT 1, 0363
& (obverse & obverse copy of P126262 = PDT 2, 0902)
Perhaps it should be a $-line, >>, or comment instead? Also it's not clear what obverse & obverse refers to. Should that be obverse & reverse?
reported by @jnovotny-lmu
error in line 124209 (115 columns instead of 63 columns) of cdli_catalogue_1of2.csv
Line 124209 of the file has
,,,,21198/zz001w65mw,"no atf",,nn,,,,,"University of Pennsylvania Museum of Archaeology and Anthropology, Philadelphia, Pennsylvania, USA",,"obv damaged",10/24/2005,,,10/21/2018,,"20051024 fitzgerald_upenn","N 2004",,,,,,,,,Administrative,,,?,124245,0,277115,,Akkadian,,clay,"N 2004",,tablet,"Neo-Babylonian (ca. 626-539 BC)",,"600ppi 20160630","unpublished unassigned ?","Nippur (mod. Nuffar)",,nd,,,,,,,"Account; payments of shekel of ?; 10x16x2(u.e.)x2(le.e.,,,,21198/zz001w65nd,"no atf",,nn,,,,,"University of Pennsylvania Museum of Archaeology and Anthropology, Philadelphia, Pennsylvania, USA",,"rev destroyed",10/24/2005,,-/VIII/-,10/21/2018,,"20051024 fitzgerald_upenn","N 2005",,,,,,,,,Administrative,,,?,124246,0,277116,,Akkadian,,clay,"N 2005",,tablet,"Middle Babylonian (ca. 1400-1100 BC)",,"600ppi 20160630","unpublished unassigned ?","Nippur (mod. Nuffar)",,nd,,,,,,,"Ledger; accounts for certain months?; 8x3 lines",,,?,"no translation",?
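Rows like this one can be located by comparing each row's field count with the header's. The Python sketch below uses the standard csv module. The function name and the tiny sample are invented for illustration, and the sample only imitates the unterminated quote, it is not real catalogue data.

```python
import csv
import io


def rows_with_wrong_width(csv_text):
    """Return (line_number, field_count) for rows whose width differs from the header's."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    return [
        (reader.line_num, len(row))
        for row in reader
        if len(row) != len(header)
    ]


# Line 3 has a quoted field that is never properly closed before the next record's data.
sample = 'a,b,c\n1,x,"ok"\n2,y,"cut off,3,z,"fine",w\n'
print(rows_with_wrong_width(sample))  # [(3, 4)]
```

How a damaged row is split depends on where the stray quotes fall, so the reported count is a symptom rather than a diagnosis; the line number is the useful part.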
In P462024 line 116 the translation directive is missing a colon.
116. a-na
#tr.en to
The last line should be #tr.en: to
P204453 has a duplicate translation directive on obverse line 2.
2. gurum2#-ak kiszib3-ba
#tr.en: inspections, sealed documents,
#tr. inspections, sealed documents
The second #tr. line should be removed.
In P100643 the & blank space state markup should be $ blank space (wrong sigil) and the previous line should probably be labelled 2 instead of the duplicate 3.
@reverse
1. ur-ba-gara2
3. szu ba-ti
& blank space
3. mu us2-sa an-sza-an{ki} ba-hul
In P272835 tablet line 4, the #tr directive is missing a colon (':').
#tr.en who, at the issuing of his order and the giving of his solemn decree
In P345966 the blank space annotation is missing its initial $ sigil on the reverse after line 2. The line should be $ blank space
2. kiszib3# ur-{d}nu-musz#-da
#tr.en: the sealed tablet of Ur-Numushda.
blank space
3. mu# us2-sa {d}szu-{d}suen lugal#-e bad3 mar-tu mu-du3
#tr.en: Year after: “The king Šu-Suen erected the Amorite wall.”
P127683 is listed without a P-number. The record begins with &1 instead of &P127683.
13. u4 2(u) 6(disz)-kam
14. iti ezem-{d}li9-si4?
&1 = RA 019, 040 21
#atf: lang sux
@tablet
@obverse
1. [...] x x x x
It looks like this was introduced inadvertently in a 2015 June 24 edit.
In P464358 law 16 line 527, there's a space between the translation directive and the language code: #tr. en should be #tr.en. On lines 529 and 530, #tr.tr should be #tr.ts.
@law 16
[...]
527. la usz-te-s,i2-a-am
#tr.ts: lā uštēṣiam
#tr. en: has not let him go out,
[...]
529. id-da-ak
#tr.tr: iddâk
#tr.en: shall be killed.
@law 17
530. szum-ma a-wi-lum
#tr.tr: šumma awīlum
#tr.en: If a man
In P513803 the publication_history catalogue data field contains a ctrl-K before the second citation.
Groneberg, Brigitte, CM, 08 (1997) 084-093; �Streck, Michael P., JAOS 130 (2010) 561-571
In P212934 the language declaration has a typo: #atfz; lang sux should be #atf: lang sux
In P338870 on the obverse, line 9, the translation is marked as another line of normalization and is missing a colon. The second #tr.ts should be #tr.en:
9. u3 u2-sza-ri-a-kum
#tr.ts: u ušari’akkum
#tr.ts further I had (them) led to you.
There seems to be some corruption in the catalogue data export for P277115 and P277116. On line 124209 of cdli_catalogue_1of2.csv, the sub-genre comments column of the first tablet stops abruptly, without a closing quotation mark, and is followed by the entry for the second tablet on the same line.
[...],"Account; payments of shekel of ?; 10x16x2(u.e.)x2(le.e.,,,,21198/zz001w65nd,"no atf",[..]
In P346149 column 1' line 6', the double ruling annotation is marked as part of the English translation. It should probably use $-line markup to match other tablets.
6'. sza3 gi4-[...]
#tr.en: You can argue with me by means of your truthful(?) heart?
#tr.en: (double ruling)
7'. x [...]
The line before 7' should instead be:
$ double ruling
The ATF record for P513444 has a spurious .jpg after the CDLI id number on the first line.
&P513444.jpg = RIME 4.04.01.02, ex. add175
It should be
&P513444 = RIME 4.04.01.02, ex. add175
Some fields in the catalogue csv file have data in non-utf-8 encodings. This is confusing for readers, and also results in incorrect display on the object webpage.
For example in P222716 Frühdyn. Beterstatuetten displays as Fr√ºhdyn. Beterstatuetten in the secondary publications field.
It's common in the CDLI comments field as well. For example in P282483 Fs. Košak displays as Fs Ko√∂ak.
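The garbling in the first example is consistent with UTF-8 bytes being decoded as Mac OS Roman somewhere along the way. That diagnosis is an assumption based on the character pattern, not something confirmed for the CDLI pipeline. A small Python sketch reproduces the effect and shows that this kind of damage can be reversed. The function names are made up for the example.

```python
def misdecode(text, actual="utf-8", assumed="mac_roman"):
    """Encode text as `actual`, then decode the bytes as if they were `assumed`."""
    return text.encode(actual).decode(assumed)


def repair(garbled, actual="utf-8", assumed="mac_roman"):
    """Undo misdecode(); may raise a Unicode error if the text was not garbled this way."""
    return garbled.encode(assumed).decode(actual)


print(misdecode("Frühdyn. Beterstatuetten"))  # Fr√ºhdyn. Beterstatuetten
print(repair("Fr√ºhdyn. Beterstatuetten"))    # Frühdyn. Beterstatuetten
```

The second example appears to involve an additional conversion step, so a single round trip like this would not be enough to restore it.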
In P008365 #tr.en. (header) should probably be #tr.en: (header)