graph LR; ftp(ftp.crossref.org); ingest(Cayenne Task Ingest Journals); index(Cayenne ES Journals Index); api(Cayenne /v1/journals); ftp-->ingest; ingest-->index; index-->api;
Cayenne ingests journals from the location configured as [:location :cr-titles-csv], currently http://ftp.crossref.org/titlelist/titleFile.csv. This is a CSV file containing information about journals, one journal per line.
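
As a rough illustration of the "one journal per line" layout, the following Python sketch parses a small in-memory sample with the standard csv module. Cayenne itself is not written in Python, and the column names and sample rows here are hypothetical; the header of the real titleFile.csv may differ.

```python
import csv
import io

# Hypothetical sample data. The column names are assumptions made for
# illustration and are not taken from the real titleFile.csv.
SAMPLE = """JournalTitle,JournalID,Publisher,pissn,eissn
Journal of Examples,1001,Example Press,1234-5678,
Annals of Samples,1002,Sample Society,,8765-4321
"""


def parse_titles(text):
    """Return one dict per journal row, keyed by the CSV header names."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


journals = parse_titles(SAMPLE)
for journal in journals:
    print(journal["JournalID"], journal["JournalTitle"])
```

In a real ingest the text would come from the configured location rather than from a string constant; empty cells, such as a missing ISSN, come back as empty strings.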
The journal records are then indexed through Elasticsearch’s bulk API as a series of update actions. Each action either adds a new record or updates the existing one. The update action is used rather than index so that the journal subjects, which a separate process adds to journal records (see funding data), are not erased.
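
The difference between the two bulk actions can be sketched in Python as follows. This is a minimal illustration, not Cayenne's implementation: the index name "journal", the field names, and the use of doc_as_upsert to get add-or-update behaviour are assumptions. The two simulate_* helpers are hypothetical stand-ins that mimic, for flat records only, how Elasticsearch merges a partial document on update and replaces the whole document on index.

```python
import json


def bulk_update_body(journals, index="journal"):
    """Build a newline-delimited JSON body for the Elasticsearch _bulk endpoint.

    Each journal yields two lines: the update action metadata, then a partial
    document. doc_as_upsert asks Elasticsearch to create the record when it
    does not exist yet (assumed here, not confirmed for Cayenne).
    """
    lines = []
    for journal in journals:
        lines.append(json.dumps({"update": {"_index": index, "_id": journal["id"]}}))
        lines.append(json.dumps({"doc": journal, "doc_as_upsert": True}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def simulate_update(stored, doc):
    """Hypothetical stand-in for update: merge doc into the stored record."""
    merged = dict(stored)
    merged.update(doc)
    return merged


def simulate_index(stored, doc):
    """Hypothetical stand-in for index: the stored record is replaced."""
    return dict(doc)


# A record that already carries subjects added by the separate process.
stored = {"id": "1001", "title": "Old Title", "subjects": ["Chemistry"]}
# The same journal as it arrives from the CSV file, without subjects.
incoming = {"id": "1001", "title": "Journal of Examples"}

body = bulk_update_body([incoming])
after_update = simulate_update(stored, incoming)
after_index = simulate_index(stored, incoming)
```

Here after_update keeps the subjects field alongside the refreshed title, while after_index loses it, which is the reason given above for preferring update.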
Journals are ingested once a day.