Package org.apache.commons.io.input
Class BOMInputStream
- java.lang.Object
  - java.io.InputStream
    - java.io.FilterInputStream
      - org.apache.commons.io.input.ProxyInputStream
        - org.apache.commons.io.input.BOMInputStream
- All Implemented Interfaces:
  java.io.Closeable, java.lang.AutoCloseable
public class BOMInputStream extends ProxyInputStream
This class is used to wrap a stream that includes an encoded ByteOrderMark as its first bytes. This class detects these bytes and, if required, can automatically skip them and return the subsequent byte as the first byte in the stream. The ByteOrderMark implementation has the following pre-defined BOMs:
- UTF-8 - ByteOrderMark.UTF_8
- UTF-16BE - ByteOrderMark.UTF_16BE
- UTF-16LE - ByteOrderMark.UTF_16LE
- UTF-32BE - ByteOrderMark.UTF_32BE
- UTF-32LE - ByteOrderMark.UTF_32LE
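For orientation, the byte signatures behind the five encodings listed above can be written out as a small lookup table. This is a minimal stdlib-only sketch: the class name `BomSignatures` and the method `startsWithBom` are hypothetical and are not part of Commons IO. The byte values are the standard Unicode BOM signatures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: names here are hypothetical, not Commons IO API.
class BomSignatures {
    // Standard Unicode BOM byte signatures, keyed by charset name.
    static final Map<String, int[]> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("UTF-8",    new int[] {0xEF, 0xBB, 0xBF});
        SIGNATURES.put("UTF-16BE", new int[] {0xFE, 0xFF});
        SIGNATURES.put("UTF-16LE", new int[] {0xFF, 0xFE});
        SIGNATURES.put("UTF-32BE", new int[] {0x00, 0x00, 0xFE, 0xFF});
        SIGNATURES.put("UTF-32LE", new int[] {0xFF, 0xFE, 0x00, 0x00});
    }

    // True if data begins with the signature registered under charsetName.
    static boolean startsWithBom(byte[] data, String charsetName) {
        int[] sig = SIGNATURES.get(charsetName);
        if (sig == null || data.length < sig.length) {
            return false;
        }
        for (int i = 0; i < sig.length; i++) {
            // Mask to compare the signed Java byte as an unsigned value.
            if ((data[i] & 0xFF) != sig[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Note that the UTF-16LE signature is a prefix of the UTF-32LE signature, so a naive prefix check reports both for UTF-32LE input.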
Example 1 - Detect and exclude a UTF-8 BOM
    BOMInputStream bomIn = new BOMInputStream(in);
    if (bomIn.hasBOM()) {
        // has a UTF-8 BOM
    }
Example 2 - Detect a UTF-8 BOM (but don't exclude it)
    boolean include = true;
    BOMInputStream bomIn = new BOMInputStream(in, include);
    if (bomIn.hasBOM()) {
        // has a UTF-8 BOM
    }
Example 3 - Detect Multiple BOMs
    BOMInputStream bomIn = new BOMInputStream(in,
        ByteOrderMark.UTF_16LE, ByteOrderMark.UTF_16BE,
        ByteOrderMark.UTF_32LE, ByteOrderMark.UTF_32BE);
    if (!bomIn.hasBOM()) {
        // No BOM found
    } else if (bomIn.hasBOM(ByteOrderMark.UTF_16LE)) {
        // has a UTF-16LE BOM
    } else if (bomIn.hasBOM(ByteOrderMark.UTF_16BE)) {
        // has a UTF-16BE BOM
    } else if (bomIn.hasBOM(ByteOrderMark.UTF_32LE)) {
        // has a UTF-32LE BOM
    } else if (bomIn.hasBOM(ByteOrderMark.UTF_32BE)) {
        // has a UTF-32BE BOM
    }
- Since: 2.0
- See Also: ByteOrderMark, Wikipedia - Byte Order Mark
Field Summary
Modifier and Type | Field | Description
private java.util.List<ByteOrderMark> | boms | BOMs are sorted from longest to shortest.
private ByteOrderMark | byteOrderMark |
private static java.util.Comparator<ByteOrderMark> | ByteOrderMarkLengthComparator | Compares ByteOrderMark objects in descending length order.
private int | fbIndex |
private int | fbLength |
private int[] | firstBytes |
private boolean | include |
private boolean | markedAtStart |
private int | markFbIndex |
-
Constructor Summary
Constructor | Description
BOMInputStream(java.io.InputStream delegate) | Constructs a new BOM InputStream that excludes a ByteOrderMark.UTF_8 BOM.
BOMInputStream(java.io.InputStream delegate, boolean include) | Constructs a new BOM InputStream that detects a ByteOrderMark.UTF_8 BOM and optionally includes it.
BOMInputStream(java.io.InputStream delegate, boolean include, ByteOrderMark... boms) | Constructs a new BOM InputStream that detects the specified BOMs and optionally includes them.
BOMInputStream(java.io.InputStream delegate, ByteOrderMark... boms) | Constructs a new BOM InputStream that excludes the specified BOMs.
-
Method Summary
Modifier and Type | Method | Description
private ByteOrderMark | find() | Find a BOM with the specified bytes.
ByteOrderMark | getBOM() | Return the BOM (Byte Order Mark).
java.lang.String | getBOMCharsetName() | Return the BOM charset name - ByteOrderMark.getCharsetName().
boolean | hasBOM() | Indicates whether the stream contains one of the specified BOMs.
boolean | hasBOM(ByteOrderMark bom) | Indicates whether the stream contains the specified BOM.
void | mark(int readlimit) | Invokes the delegate's mark(int) method.
private boolean | matches(ByteOrderMark bom) | Check if the bytes match a BOM.
int | read() | Invokes the delegate's read() method, detecting and optionally skipping the BOM.
int | read(byte[] buf) | Invokes the delegate's read(byte[]) method, detecting and optionally skipping the BOM.
int | read(byte[] buf, int off, int len) | Invokes the delegate's read(byte[], int, int) method, detecting and optionally skipping the BOM.
private int | readFirstBytes() | This method reads and either preserves or skips the first bytes in the stream.
void | reset() | Invokes the delegate's reset() method.
long | skip(long n) | Invokes the delegate's skip(long) method, detecting and optionally skipping the BOM.
- Methods inherited from class org.apache.commons.io.input.ProxyInputStream
afterRead, available, beforeRead, close, handleIOException, markSupported
Field Detail
-
include
private final boolean include
-
boms
private final java.util.List<ByteOrderMark> boms
BOMs are sorted from longest to shortest.
-
byteOrderMark
private ByteOrderMark byteOrderMark
-
firstBytes
private int[] firstBytes
-
fbLength
private int fbLength
-
fbIndex
private int fbIndex
-
markFbIndex
private int markFbIndex
-
markedAtStart
private boolean markedAtStart
-
ByteOrderMarkLengthComparator
private static final java.util.Comparator<ByteOrderMark> ByteOrderMarkLengthComparator
Compares ByteOrderMark objects in descending length order.
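A short sketch may help show why a descending length order matters: the UTF-16LE signature (FF FE) is a prefix of the UTF-32LE signature (FF FE 00 00), so trying the shorter BOM first would misreport UTF-32LE input. The code below is a stdlib-only illustration under that assumption. The `Bom` type, `LONGEST_FIRST`, `firstMatch`, and `sorted` are hypothetical names and do not reproduce the Commons IO implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class LongestFirstBomMatch {
    // Hypothetical stand-in for a BOM: a charset name plus its byte signature.
    static final class Bom {
        final String name;
        final int[] bytes;
        Bom(String name, int... bytes) {
            this.name = name;
            this.bytes = bytes;
        }
    }

    static final Bom UTF_16LE = new Bom("UTF-16LE", 0xFF, 0xFE);
    static final Bom UTF_32LE = new Bom("UTF-32LE", 0xFF, 0xFE, 0x00, 0x00);

    // Descending length order, so longer signatures are tried first.
    static final Comparator<Bom> LONGEST_FIRST =
            (a, b) -> Integer.compare(b.bytes.length, a.bytes.length);

    // Copies the candidates into a list sorted longest signature first.
    static List<Bom> sorted(Bom... boms) {
        List<Bom> list = new ArrayList<>(Arrays.asList(boms));
        list.sort(LONGEST_FIRST);
        return list;
    }

    // Name of the first candidate whose signature prefixes data, or null.
    static String firstMatch(List<Bom> candidates, byte[] data) {
        for (Bom bom : candidates) {
            if (matches(bom, data)) {
                return bom.name;
            }
        }
        return null;
    }

    static boolean matches(Bom bom, byte[] data) {
        if (data.length < bom.bytes.length) {
            return false;
        }
        for (int i = 0; i < bom.bytes.length; i++) {
            if ((data[i] & 0xFF) != bom.bytes[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With the candidates left in the order UTF-16LE, UTF-32LE, `firstMatch` reports UTF-16LE for UTF-32LE input. After `sorted(...)` it reports UTF-32LE.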
-
-
Constructor Detail
-
BOMInputStream
public BOMInputStream(java.io.InputStream delegate)
Constructs a new BOM InputStream that excludes a ByteOrderMark.UTF_8 BOM.
- Parameters:
  delegate - the InputStream to delegate to
-
BOMInputStream
public BOMInputStream(java.io.InputStream delegate, boolean include)
Constructs a new BOM InputStream that detects a ByteOrderMark.UTF_8 BOM and optionally includes it.
- Parameters:
  delegate - the InputStream to delegate to
  include - true to include the UTF-8 BOM or false to exclude it
-
BOMInputStream
public BOMInputStream(java.io.InputStream delegate, ByteOrderMark... boms)
Constructs a new BOM InputStream that excludes the specified BOMs.
- Parameters:
  delegate - the InputStream to delegate to
  boms - The BOMs to detect and exclude
-
BOMInputStream
public BOMInputStream(java.io.InputStream delegate, boolean include, ByteOrderMark... boms)
Constructs a new BOM InputStream that detects the specified BOMs and optionally includes them.
- Parameters:
  delegate - the InputStream to delegate to
  include - true to include the specified BOMs or false to exclude them
  boms - The BOMs to detect and optionally exclude
-
-
Method Detail
-
hasBOM
public boolean hasBOM() throws java.io.IOException
Indicates whether the stream contains one of the specified BOMs.
- Returns:
  true if the stream has one of the specified BOMs, otherwise false
- Throws:
  java.io.IOException - if an error occurs while reading the first bytes of the stream
-
hasBOM
public boolean hasBOM(ByteOrderMark bom) throws java.io.IOException
Indicates whether the stream contains the specified BOM.
- Parameters:
  bom - The BOM to check for
- Returns:
  true if the stream has the specified BOM, otherwise false
- Throws:
  java.lang.IllegalArgumentException - if the BOM is not one the stream is configured to detect
  java.io.IOException - if an error occurs while reading the first bytes of the stream
-
getBOM
public ByteOrderMark getBOM() throws java.io.IOException
Return the BOM (Byte Order Mark).
- Returns:
  The BOM or null if none
- Throws:
  java.io.IOException - if an error occurs while reading the first bytes of the stream
-
getBOMCharsetName
public java.lang.String getBOMCharsetName() throws java.io.IOException
Return the BOM charset name - ByteOrderMark.getCharsetName().
- Returns:
  The BOM charset name or null if no BOM found
- Throws:
  java.io.IOException - if an error occurs while reading the first bytes of the stream
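Because the charset name is null when no BOM is found, callers usually need a fallback before building a Reader. A minimal sketch of that decision follows, using only the JDK. The class `BomCharsetFallback` and the method `resolve` are hypothetical helpers, not Commons IO API, and the fallback charset is an assumption the caller must choose.

```java
import java.nio.charset.Charset;

// Illustrative helper: names are hypothetical, not part of Commons IO.
class BomCharsetFallback {
    // bomCharsetName is whatever BOM detection reported, or null if no BOM was found.
    // Charset.forName throws if the running JVM does not support the named charset.
    static Charset resolve(String bomCharsetName, Charset fallback) {
        return bomCharsetName != null ? Charset.forName(bomCharsetName) : fallback;
    }
}
```

The resolved charset could then be passed to `new InputStreamReader(stream, charset)`. UTF-32 charsets are available on common JVMs but are not among the charsets every Java platform is required to support, so that case is worth guarding.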
-
readFirstBytes
private int readFirstBytes() throws java.io.IOException
This method reads and either preserves or skips the first bytes in the stream. It behaves like the single-byte read() method, either returning a valid byte or -1 to indicate that the initial bytes have been processed already.
- Returns:
  the byte read (excluding BOM) or -1 at the end of the stream
- Throws:
  java.io.IOException - if an I/O error occurs
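The buffer-then-replay idea described here can be sketched with the JDK alone. The class below is a simplified illustration, not the Commons IO implementation: it handles only the UTF-8 BOM, does not support mark/reset, and the class name `Utf8BomSkippingStream` and helper `drain` are hypothetical. The field names echo those in the field summary, but how the real class uses them is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Sketch only: buffers the leading bytes, detects a UTF-8 BOM, then replays or skips it.
class Utf8BomSkippingStream extends InputStream {
    private static final int[] UTF8_BOM = {0xEF, 0xBB, 0xBF};

    private final InputStream in;
    private final boolean include; // true: pass the BOM through; false: drop it
    private int[] firstBytes;      // buffered leading bytes; null until first use
    private int fbLength;          // number of valid entries in firstBytes
    private int fbIndex;           // next buffered byte to hand out
    private boolean hasBom;

    Utf8BomSkippingStream(InputStream in, boolean include) {
        this.in = in;
        this.include = include;
    }

    // Lazily buffers up to BOM-length bytes and decides whether they form a BOM.
    private void fillFirstBytes() throws IOException {
        if (firstBytes != null) {
            return;
        }
        firstBytes = new int[UTF8_BOM.length];
        for (int i = 0; i < firstBytes.length; i++) {
            int b = in.read();
            if (b < 0) {
                break;
            }
            firstBytes[fbLength++] = b;
        }
        hasBom = fbLength == UTF8_BOM.length;
        for (int i = 0; hasBom && i < UTF8_BOM.length; i++) {
            hasBom = firstBytes[i] == UTF8_BOM[i];
        }
        if (hasBom && !include) {
            fbIndex = UTF8_BOM.length; // start replay after the BOM
        }
    }

    boolean hasBOM() throws IOException {
        fillFirstBytes();
        return hasBom;
    }

    @Override
    public int read() throws IOException {
        fillFirstBytes();
        // Replay buffered bytes first, then continue with the wrapped stream.
        return fbIndex < fbLength ? firstBytes[fbIndex++] : in.read();
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Convenience for experiments: reads everything the sketch stream yields.
    static byte[] drain(byte[] data, boolean include) {
        try (InputStream s = new Utf8BomSkippingStream(new ByteArrayInputStream(data), include)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (int b = s.read(); b >= 0; b = s.read()) {
                out.write(b);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch extends InputStream rather than FilterInputStream so that the inherited bulk read methods go through the overridden read() and therefore see the buffered bytes.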
-
find
private ByteOrderMark find()
Find a BOM with the specified bytes.
- Returns:
  The matched BOM or null if none matched
-
matches
private boolean matches(ByteOrderMark bom)
Check if the bytes match a BOM.
- Parameters:
  bom - The BOM
- Returns:
  true if the bytes match the bom, otherwise false
-
read
public int read() throws java.io.IOException
Invokes the delegate's read() method, detecting and optionally skipping the BOM.
- Overrides:
  read in class ProxyInputStream
- Returns:
  the byte read (excluding BOM) or -1 at the end of the stream
- Throws:
  java.io.IOException - if an I/O error occurs
-
read
public int read(byte[] buf, int off, int len) throws java.io.IOException
Invokes the delegate's read(byte[], int, int) method, detecting and optionally skipping the BOM.
- Overrides:
  read in class ProxyInputStream
- Parameters:
  buf - the buffer to read the bytes into
  off - The start offset
  len - The number of bytes to read (excluding BOM)
- Returns:
  the number of bytes read or -1 at the end of the stream
- Throws:
  java.io.IOException - if an I/O error occurs
-
read
public int read(byte[] buf) throws java.io.IOException
Invokes the delegate's read(byte[]) method, detecting and optionally skipping the BOM.
- Overrides:
  read in class ProxyInputStream
- Parameters:
  buf - the buffer to read the bytes into
- Returns:
  the number of bytes read (excluding BOM) or -1 at the end of the stream
- Throws:
  java.io.IOException - if an I/O error occurs
-
mark
public void mark(int readlimit)
Invokes the delegate's mark(int) method.
- Overrides:
  mark in class ProxyInputStream
- Parameters:
  readlimit - read ahead limit
-
reset
public void reset() throws java.io.IOException
Invokes the delegate's reset() method.
- Overrides:
  reset in class ProxyInputStream
- Throws:
  java.io.IOException - if an I/O error occurs
-
skip
public long skip(long n) throws java.io.IOException
Invokes the delegate's skip(long) method, detecting and optionally skipping the BOM.
- Overrides:
  skip in class ProxyInputStream
- Parameters:
  n - the number of bytes to skip
- Returns:
  the number of bytes skipped or -1 at the end of the stream
- Throws:
  java.io.IOException - if an I/O error occurs
-
-