public class HadoopFileStore extends Object implements FileStore
FileStore implementation based on the Hadoop API.
A HadoopFileStore stores its files in a Hadoop FileSystem, under a configurable root path; the filesystem can be any of the filesystems supported by the Hadoop API, including the local (raw) filesystem and the distributed HDFS filesystem.
Files are stored in a two-level directory structure, where first-level directories reflect the MIME types of stored files, and second-level directories are buckets whose names are obtained by hashing the filename; buckets are used to split a large number of files evenly across several subdirectories, overcoming possible filesystem limits on the maximum number of files storable in a directory.
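The two-level layout described above can be sketched as follows. This is only an illustration under stated assumptions: the class name BucketLayoutSketch, the mapping from MIME type to directory name, and the choice of the first MD5 byte as the bucket name are all hypothetical, since this page does not say which hash function or naming scheme HadoopFileStore actually uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical illustration of a two-level layout; not the actual HadoopFileStore code. */
final class BucketLayoutSketch {

    /** Assumed mapping: turns a MIME type such as "application/pdf" into "application_pdf". */
    static String typeDirectory(String mimeType) {
        return mimeType.replace('/', '_');
    }

    /** Assumed bucketing: first byte of the MD5 of the filename, as two hex digits (256 buckets). */
    static String bucket(String fileName) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(fileName.getBytes(StandardCharsets.UTF_8));
            return String.format("%02x", digest[0] & 0xFF);
        } catch (NoSuchAlgorithmException ex) {
            // MD5 must be available on every Java platform, so this is not expected.
            throw new IllegalStateException(ex);
        }
    }

    /** Builds root/typeDirectory/bucket/fileName. */
    static String pathFor(String root, String mimeType, String fileName) {
        return root + "/" + typeDirectory(mimeType) + "/" + bucket(fileName) + "/" + fileName;
    }

    public static void main(String[] args) {
        System.out.println(pathFor("files", "application/pdf", "report.pdf"));
    }
}
```

Because the bucket depends only on the filename, the same name always maps to the same directory, and a large set of names spreads over at most 256 subdirectories per MIME type in this sketch.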
Constructor | Description |
---|---|
HadoopFileStore(FileSystem fileSystem, String path) | Creates a new HadoopFileStore storing files in the FileSystem and under the rootPath specified. |
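According to the constructor documentation on this page, the path argument may be null (in which case the default root path "files" is used) or relative to the filesystem working directory. A minimal sketch of such a resolution rule is given here, assuming java.nio.file.Path as a stand-in for Hadoop's path type; RootPathSketch and resolveRoot are hypothetical names, and the real class may resolve paths differently.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical illustration of root path resolution; not the actual HadoopFileStore code. */
final class RootPathSketch {

    /** Default root path named in the constructor documentation. */
    static final String DEFAULT_ROOT = "files";

    /**
     * Resolves the configured path against the working directory.
     * A null path falls back to the default; an absolute path is kept as is.
     */
    static Path resolveRoot(Path workingDirectory, String path) {
        String effective = (path != null) ? path : DEFAULT_ROOT;
        return workingDirectory.resolve(effective).normalize();
    }

    public static void main(String[] args) {
        Path workingDirectory = Paths.get("work").toAbsolutePath();
        System.out.println(resolveRoot(workingDirectory, null));
        System.out.println(resolveRoot(workingDirectory, "data/store"));
    }
}
```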
Modifier and Type | Method | Description |
---|---|---|
void | close() | Closes this Component object. |
void | delete(String fileName) | Deletes a file on the FileStore. |
void | init() | Initializes the Component with the supplied Runtime object. |
Stream<String> | list() | Lists all the files stored on the FileStore. |
InputStream | read(String fileName) | Reads a file stored on the FileStore. |
String | toString() | |
OutputStream | write(String fileName) | Writes a new file to the FileStore. |
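The contract of the methods summarized above (write fails on an existing name, read and delete fail on a missing name, writing completes when the stream is closed) can be illustrated with a small in-memory stand-in. This is a sketch only: InMemoryStoreSketch is hypothetical, it does not use Hadoop, and its nested exception classes are local stand-ins for FileExistsException and FileMissingException; making them subclasses of IOException is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical in-memory stand-in mimicking the documented contract; not thread-safe. */
final class InMemoryStoreSketch {

    /** Local stand-in for the FileExistsException named in the write signature. */
    static final class FileExistsException extends IOException {
        FileExistsException(String message) { super(message); }
    }

    /** Local stand-in for the FileMissingException named in read and delete. */
    static final class FileMissingException extends IOException {
        FileMissingException(String message) { super(message); }
    }

    private final Map<String, byte[]> files = new HashMap<>();

    /** Fails if the name is taken; the content is stored when the stream is closed. */
    OutputStream write(String fileName) throws FileExistsException, IOException {
        if (files.containsKey(fileName)) {
            throw new FileExistsException(fileName);
        }
        return new ByteArrayOutputStream() {
            @Override
            public void close() {
                files.put(fileName, toByteArray());
            }
        };
    }

    /** Returns a stream positioned at the first byte, or fails if the name is unknown. */
    InputStream read(String fileName) throws FileMissingException, IOException {
        byte[] content = files.get(fileName);
        if (content == null) {
            throw new FileMissingException(fileName);
        }
        return new ByteArrayInputStream(content);
    }

    /** Removes the file, or fails if the name is unknown. */
    void delete(String fileName) throws FileMissingException, IOException {
        if (files.remove(fileName) == null) {
            throw new FileMissingException(fileName);
        }
    }

    /** Streams the names of the stored files. */
    Stream<String> list() {
        return files.keySet().stream();
    }

    public static void main(String[] args) throws IOException {
        InMemoryStoreSketch store = new InMemoryStoreSketch();
        try (OutputStream out = store.write("note.txt")) {
            out.write("hello".getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = store.read("note.txt")) {
            in.skip(1); // forward seeking, which the read contract allows
            System.out.println((char) in.read()); // prints e
        }
        store.delete("note.txt");
        try {
            store.read("note.txt");
        } catch (FileMissingException ex) {
            System.out.println("missing after delete");
        }
    }
}
```

Unlike this sketch, a real FileStore implementation may list a file while it is still being written, as the write documentation notes.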
public HadoopFileStore(FileSystem fileSystem, @Nullable String path)
Creates a new HadoopFileStore storing files in the FileSystem and under the rootPath specified.
Parameters:
fileSystem - the file system, not null
path - the root path where to store files, possibly relative to the filesystem working directory; if null, the default root path "files" will be used

public void init() throws IOException
Initializes the Component with the supplied Runtime object. This method is called after the instantiation of a Component and before any other instance method is called. It provides a Runtime that can be used to access runtime services such as locking, serialization and filesystem access. The Component is allowed to perform any initialization operation that is necessary in order to become functional; on failure, these operations may result in an IOException being thrown.
Specified by:
init in interface Component
Throws:
IOException - in case initialization fails

public InputStream read(String fileName) throws FileMissingException, IOException
Reads a file stored on the FileStore. The method returns an InputStream over the content of the file, starting from its first byte. Sequential access and forward seeking (InputStream.skip(long)) are supported. Note that multiple concurrent read operations on the same file are allowed.
Specified by:
read in interface FileStore
Parameters:
fileName - the name of the file
Returns:
an InputStream allowing access to the content of the file
Throws:
FileMissingException - in case no file exists for the name specified, resulting either from a caller error or from an external modification (deletion) to the FileStore
IOException - in case the file cannot be accessed for whatever reason, with no implication on whether the file exists (e.g., the whole FileStore may be temporarily inaccessible)

public OutputStream write(String fileName) throws FileExistsException, IOException
Writes a new file to the FileStore. The method creates a file with the name specified, and returns an OutputStream that can be used to write the content of the file. Writing is completed by closing that stream, which forces written data to be flushed to disk; errors in writing the file may result in the stream being forcibly closed and a truncated file being written. The file being written may be listed by FileStore.list() (depending on the FileStore implementation) but should not be accessed for read or deletion until writing is completed; the consequences of doing so are not specified.
Specified by:
write in interface FileStore
Parameters:
fileName - the name of the file
Returns:
an OutputStream to which the content of the file can be written
Throws:
FileExistsException - in case a file with the same name already exists, resulting either from a caller error or from an external modification (file creation) to the FileStore
IOException - in case the file cannot be created for whatever reason, with no implication on whether another file with the same name already exists

public void delete(String fileName) throws FileMissingException, IOException
Deletes a file on the FileStore. An exception is thrown if the file specified does not exist. After the method returns, the file cannot be accessed anymore from FileStore.read(String) or FileStore.delete(String). Deleting a file being read is allowed and eventually results in the deletion of the file; whether the concurrent read operation fails or is permitted to complete is an implementation detail.
Specified by:
delete in interface FileStore
Parameters:
fileName - the name of the file
Throws:
FileMissingException - in case the file specified does not exist, possibly because of an external modification to the FileStore
IOException - in case the file cannot be accessed for whatever reason, with no implication on whether the file specified actually exists

public Stream<String> list() throws IOException
Lists all the files stored on the FileStore. The method returns a Stream supporting streaming access to the names of the files stored in the FileStore. Concurrent modifications to the FileStore (FileStore.write(String), FileStore.delete(String)) may be supported, depending on the implementation: in that case, the stream may reflect the status either before or after the concurrent modification.
Specified by:
list in interface FileStore
Returns:
a Stream over the names of the files stored in the FileStore
Throws:
IOException - in case of failure, for whatever reason

public void close()
Closes this Component object. If the component has been initialized, closing it causes any allocated resource to be freed and any operation or transaction ongoing within the component to be aborted; in case the component has not been initialized yet, or close() has already been called, calling this method has no effect. Note that the operation affects only the local Component object and not any remote service this object may rely on to implement its functionalities; in particular, such a remote service is not shut down by the operation, so that it can be accessed by other Component instances possibly running in other (virtual) machines. Similarly, closing a Component object has no impact on stored data, which continues to be persisted and will be accessed unchanged (provided no external modification occurs) the next time a similarly configured Component is created.

Copyright © 2015–2016 FBK-irst. All rights reserved.