Overview
- Path type: Directory with Annotations/ and optional JPEGImages/
- Lossiness: Lossy (see below)
- Bbox format: Pixel-space XYXY [xmin, ymin, xmax, ymax]
- Use case: Legacy datasets, academic benchmarks
Directory Structure
Key Components
- Annotations/: XML files, one per image (required)
- JPEGImages/: Image files (optional, not read by Panlabel)
XML Structure
Bounding Box Format
VOC uses pixel-space XYXY coordinates (same as IR):
- xmin: Left edge in pixels
- ymin: Top edge in pixels
- xmax: Right edge in pixels
- ymax: Bottom edge in pixels
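As an illustration of the coordinate layout, the sketch below parses a small VOC annotation with Python's standard library and returns each box as [xmin, ymin, xmax, ymax]. The XML sample and the `read_boxes` helper are hypothetical and written for this example only; this is not Panlabel's own reader.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal VOC annotation, written for this example only.
VOC_XML = """
<annotation>
  <filename>img001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox>
      <xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>
    </bndbox>
  </object>
</annotation>
""".strip()


def read_boxes(xml_text):
    """Return (class_name, [xmin, ymin, xmax, ymax]) for every <object>."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # Values are taken as written: no 0/1-based shift is applied.
        coords = [float(bndbox.findtext(key))
                  for key in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), coords))
    return boxes


boxes = read_boxes(VOC_XML)
```

Here `boxes` holds one entry for the single `cat` object, with the four edges in XYXY order.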
Object Attributes
VOC supports several object-level attributes:
- pose: Object pose (e.g., “Frontal”, “Left”, “Unspecified”)
- truncated: 1 if the object is cut off at the image boundary, 0 otherwise
- difficult: 1 if the object is hard to recognize, 0 otherwise
- occluded: 1 if the object is occluded, 0 otherwise (non-standard but supported)
Attribute Mapping
Reading:
- Retrieves attributes from IR annotation
- Normalizes boolean values:
  - true/yes/1 → 1
  - false/no/0 → 0
  - Other values → omitted
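The normalization rule can be sketched as a small function. The name `normalize_voc_flag` is hypothetical, and the case-insensitive, whitespace-trimmed matching is an assumption of this sketch; the source only lists the accepted values.

```python
def normalize_voc_flag(value):
    """Map a boolean-like value to "1" or "0"; return None to omit it."""
    # Assumption: matching ignores case and surrounding whitespace.
    text = str(value).strip().lower()
    if text in {"true", "yes", "1"}:
        return "1"
    if text in {"false", "no", "0"}:
        return "0"
    # Any other value is dropped rather than written.
    return None
```

Returning `None` for unrecognized values lets the caller skip the element entirely, which matches the "omitted" case above.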
Reader Behavior
Input Path
Accepts:
- Dataset root containing Annotations/
- Annotations/ directory directly
Reading Process
- Discover layout (find Annotations/ directory)
- Scan Annotations/ flat only (non-recursive)
- Parse each XML file:
  - Extract <filename>, <width>, <height>
  - Extract <depth> (stored as image attribute)
  - Parse all <object> elements
- Assign deterministic IDs:
  - Image IDs: by <filename> (lexicographic)
  - Category IDs: by class name (lexicographic)
  - Annotation IDs: by XML file order, then <object> order
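The ID assignment step can be sketched as follows. This is an illustration, not Panlabel's code: `assign_ids` and its tuple-based input are hypothetical, IDs are assumed to start at 1, and "XML file order" is assumed to mean the sorted order of XML file names.

```python
def assign_ids(parsed_files):
    """parsed_files: list of (xml_name, filename, [class_name, ...]) tuples."""
    # Image IDs follow the lexicographic order of <filename>.
    filenames = sorted({filename for _, filename, _ in parsed_files})
    image_ids = {name: i for i, name in enumerate(filenames, start=1)}

    # Category IDs follow the lexicographic order of class names.
    classes = sorted({c for _, _, objs in parsed_files for c in objs})
    category_ids = {name: i for i, name in enumerate(classes, start=1)}

    # Annotation IDs follow XML file order (assumed: sorted by XML name),
    # then the order of <object> elements inside each file.
    annotations = []
    for _, filename, objs in sorted(parsed_files, key=lambda item: item[0]):
        for class_name in objs:
            annotations.append(
                (len(annotations) + 1, image_ids[filename], category_ids[class_name])
            )
    return image_ids, category_ids, annotations
```

Because every ordering is derived from sorted names rather than filesystem enumeration order, the same dataset yields the same IDs on every run.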
Coordinate Policy
Reads xmin/ymin/xmax/ymax exactly as provided (no 0/1-based adjustment).
Nested XML Warning
Nested XML files (e.g., Annotations/train/img.xml) are skipped with a warning.
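A flat scan with a warning for nested files might look like the sketch below. The function name `scan_annotations` and the warning text are hypothetical; the tool's real message may differ.

```python
from pathlib import Path
import warnings


def scan_annotations(annotations_dir):
    """Return top-level XML files; warn about XML files in subdirectories."""
    root = Path(annotations_dir)
    # glob("*.xml") only matches direct children, so the scan stays flat.
    flat = sorted(p for p in root.glob("*.xml") if p.is_file())
    # rglob also descends into subdirectories; keep only the nested hits.
    nested = sorted(p for p in root.rglob("*.xml") if p.parent != root)
    for path in nested:
        # Hypothetical message; the actual warning text is not shown here.
        warnings.warn(f"skipping nested XML file: {path.relative_to(root)}")
    return flat
```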
Writer Behavior
Output Structure
Writing Process
- Create Annotations/ and JPEGImages/ directories
- Write JPEGImages/README.txt placeholder
- For each image:
  - Create XML file at Annotations/<stem>.xml
  - Preserve subdirectory structure from file_name
  - Write all annotations sorted by annotation ID
- Does not copy image binaries
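The mapping from an image's file_name to its XML path can be sketched like this. The helper `voc_xml_path` is hypothetical, and the sketch assumes file_name is a relative path with forward slashes.

```python
from pathlib import PurePosixPath


def voc_xml_path(file_name):
    """Map an IR file_name to its XML path under Annotations/."""
    # Swap the image extension for .xml and keep any parent directories.
    relative = PurePosixPath(file_name).with_suffix(".xml")
    return PurePosixPath("Annotations") / relative
```

Under these assumptions, an image stored as train/img001.jpg maps to Annotations/train/img001.xml, so the subdirectory carries over to the output.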
Depth Attribute
Retrieves <depth> from the image attribute "depth" if present.
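A minimal sketch of this step, assuming image attributes are available as a plain dictionary and that <depth> sits inside <size> as in standard VOC files. The helper `build_size` is hypothetical, and this sketch simply leaves <depth> out when the attribute is missing; the actual writer may handle that case differently.

```python
import xml.etree.ElementTree as ET


def build_size(width, height, image_attributes):
    """Build a <size> element, adding <depth> only when the attribute exists."""
    size = ET.Element("size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    depth = image_attributes.get("depth")
    if depth is not None:
        # Assumption: the stored attribute value is written back unchanged.
        ET.SubElement(size, "depth").text = str(depth)
    return size
```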
Boolean Normalization
Writes normalized boolean attributes.
Empty Images
Writes XML files for images without annotations.
Lossiness
VOC format is lossy.
Not preserved:
- Dataset-level metadata/licenses
- Image-level license/date metadata
- Annotation confidence
- Category supercategory
- Custom attributes (except pose, truncated, difficult, occluded)
Preserved:
- Image filenames and dimensions
- Image depth (as attribute)
- Category names
- Bounding box coordinates (XYXY)
- Standard object attributes (pose, truncated, difficult, occluded)
Usage
Read VOC
The Annotations/ directory can also be passed directly.
Write VOC
Subdirectory Structure
VOC preserves the subdirectory structure of file_name in its output.
See Also
- YOLO Format: Another directory-based format
- Format Overview: Compare all supported formats