Apologies, in advance, for a possible duplicate.
I have an archive containing 117,426 files (each in the N-Triples format) that I wish to load into the default graph of a TDB dataset. Due to the large number of files, I need to be able to perform this import without manually selecting individual files for upload.
I am in Bash, with Jena and Fuseki distributions at my disposal.
If possible, I want to avoid the worst-case scenario of writing a Java application to do this. If I do have to write one, what hooks exist in RIOT/TDB to perform programmatic bulk-loading?
Answer1:
As a general comment, one way is to concatenate the N-Triples files to generate one single file.
You can load many files at once with tdbloader:
tdbloader --loc DB ... your files ...
Passing all 117,426 file names in a single command-line invocation may strain your OS. Instead, you can pipe the files into tdbloader (it's just like concatenating the files first):
... | tdbloader --loc DB -- -
Here, ... is some way to get Bash to cat the files (possibly from a subshell).
For example (you'll need to adjust the pattern to find all 117,426 files):
( for x in data*.nt; do cat "$x"; done ) | tdbloader --loc DB -- -
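If the files are spread across subdirectories, a glob like data*.nt will not reach them all. A minimal sketch, assuming the files end in .nt and live somewhere under one directory: the helper name cat_ntriples and the archive path are made up for illustration, and the final pipe into tdbloader is the same one shown above.

```shell
# Hypothetical helper: stream every .nt file under a directory to stdout.
# "find ... -exec cat {} +" batches the file names itself, so the shell
# never has to expand all 117,426 names onto one command line.
cat_ntriples() {
  dir="${1:-.}"   # directory to search; defaults to the current one
  find "$dir" -type f -name '*.nt' -exec cat {} +
}

# Usage (path is an assumption; adjust to where the archive was unpacked):
#   cat_ntriples /path/to/archive | tdbloader --loc DB -- -
```

The files are emitted in whatever order find visits them, which does not matter when everything goes into the same graph.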