API Reference
Note: For a comprehensive API reference with full parameter details, examples, and data class documentation, see API_REFERENCE.md.
Core Functions
geofabric.open(uri)
Open a dataset from various sources.
Parameters:
- uri (str): Data source URI
Supported URI formats:
| Source | Format | Example |
|--------|--------|---------|
| Local files | file:///path | file:///data/buildings.parquet |
| S3 | s3://bucket/key | s3://my-bucket/data.parquet |
| GCS | gs://bucket/key | gs://my-bucket/data.parquet |
| Azure | az://container/path | az://mycontainer/data.parquet |
| PostGIS | postgresql://... | postgresql://user:pass@host/db?table=mytable |
| STAC | stac://... | stac://catalog-url/collection |
Returns: Dataset
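Each row of the table above is distinguished by its URI scheme. The sketch below illustrates that mapping with the standard library only. It is not GeoFabric's own dispatch code, which this reference does not show, and the helper name `source_kind` is hypothetical.

```python
from urllib.parse import parse_qs, urlparse

# Scheme-to-source mapping copied from the table above.
SOURCES = {
    "file": "Local files",
    "s3": "S3",
    "gs": "GCS",
    "az": "Azure",
    "postgresql": "PostGIS",
    "stac": "STAC",
}


def source_kind(uri: str) -> str:
    """Return the source name for a URI, based only on its scheme."""
    scheme = urlparse(uri).scheme
    if scheme not in SOURCES:
        raise ValueError(f"unsupported URI scheme: {scheme!r}")
    return SOURCES[scheme]


print(source_kind("s3://my-bucket/data.parquet"))

# The PostGIS example carries the table name as a query parameter.
parts = urlparse("postgresql://user:pass@host/db?table=mytable")
print(parse_qs(parts.query)["table"][0])
```

Checking the scheme up front like this is a cheap way to fail early on a mistyped URI before any connection is attempted.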
Dataset
The Dataset class represents a connection to a data source.
Methods
| Method | Description |
|---|---|
| query() | Create a new Query builder |
| within(roi) | Filter by region of interest |
| where(predicate) | Add SQL WHERE clause |
| select(columns) | Select specific columns |
| limit(n) | Limit number of rows |
| head(n=10) | Get first n rows as DataFrame |
| sample(n=10) | Get random sample as DataFrame |
| count() | Count total rows |
| validate() | Validate geometries |
| stats() | Compute dataset statistics |
Properties
| Property | Description |
|---|---|
| columns | List of column names |
| dtypes | Dictionary of column types |
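A short sketch of inspecting a dataset with the members listed above. It assumes GeoFabric is installed and that the example path from the URI table exists locally. The import is deferred into the function so the snippet can be loaded without the package, and the function name `inspect_dataset` is hypothetical.

```python
def inspect_dataset(uri):
    """Print the schema and row count of a dataset, then return a preview."""
    import geofabric as gf  # deferred: requires GeoFabric to be installed

    ds = gf.open(uri)
    print(ds.columns)  # list of column names
    print(ds.dtypes)   # dictionary of column types
    print(ds.count())  # total number of rows
    return ds.head(5)  # first five rows as a DataFrame


# Example call (path taken from the URI table; assumed to exist):
# preview = inspect_dataset("file:///data/buildings.parquet")
```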
Query
The Query class provides a lazy, chainable query builder.
Selection & Filtering
| Method | Description |
|---|---|
| select(columns) | Select columns (str or list) |
| where(predicate) | Add WHERE clause |
| within(roi) | Filter by ROI (bbox or WKT) |
| limit(n) | Limit rows |
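Because the builder is described as lazy, a chain like the one below is expected to only assemble a query, with execution deferred to an output method. This is a sketch under assumptions: the column names `id`, `height` and `geometry` and the bounding box coordinates are invented for illustration, and passing the object returned by `geofabric.roi.bbox` to `within` is inferred from the ROI section of this reference.

```python
def tall_buildings(uri):
    """Build a filtered query over a dataset without running it."""
    import geofabric as gf  # deferred: requires GeoFabric to be installed

    # Hypothetical region of interest in the default SRID (4326).
    region = gf.roi.bbox(-74.02, 40.70, -73.97, 40.75)
    return (
        gf.open(uri)
        .query()
        .within(region)                        # spatial filter
        .where("height > 50")                  # SQL predicate on a hypothetical column
        .select(["id", "height", "geometry"])  # hypothetical column names
        .limit(1000)
    )


# query = tall_buildings("file:///data/buildings.parquet")
# print(query.sql())  # inspect the generated SQL string
```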
Output Methods
| Method | Description |
|---|---|
| sql() | Get generated SQL string |
| to_pandas() | Execute and return DataFrame |
| to_arrow() | Execute and return Arrow Table |
| to_geopandas() | Execute and return GeoDataFrame |
| to_parquet(path) | Write to Parquet file |
| to_geojson(path) | Write to GeoJSON file |
| to_geopackage(path) | Write to GeoPackage |
| to_flatgeobuf(path) | Write to FlatGeoBuf |
| to_csv(path) | Write to CSV (with WKT geometry) |
| to_pmtiles(path) | Write to PMTiles (requires tippecanoe) |
| show() | Visualize in notebook (requires viz extras) |
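Since every file writer in the table takes a single `path`, one way to use them is to pick the writer from the file extension. The helper below is a hypothetical convenience, not part of GeoFabric. The extensions are the conventional ones for each format, and `query` is assumed to be any object exposing the writer methods named above.

```python
from pathlib import Path

# File extension -> Query writer method, using the method names listed above.
WRITERS = {
    ".parquet": "to_parquet",
    ".geojson": "to_geojson",
    ".gpkg": "to_geopackage",
    ".fgb": "to_flatgeobuf",
    ".csv": "to_csv",
    ".pmtiles": "to_pmtiles",
}


def write_by_extension(query, path):
    """Call the writer method that matches the extension of `path`."""
    suffix = Path(path).suffix.lower()
    if suffix not in WRITERS:
        raise ValueError(f"no writer for extension {suffix!r}")
    return getattr(query, WRITERS[suffix])(path)


# write_by_extension(query, "out/buildings.gpkg")  # calls query.to_geopackage(...)
```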
Spatial Transformations
| Method | Description |
|---|---|
| buffer(distance, unit) | Buffer geometries |
| simplify(tolerance) | Simplify geometries |
| transform(to_srid) | Transform CRS |
| centroid() | Get geometry centroids |
| convex_hull() | Get convex hulls |
| envelope() | Get bounding boxes |
| boundary() | Extract boundaries |
| make_valid() | Repair invalid geometries |
| densify(max_distance) | Add vertices |
| explode() | Split multi-geometries |
| collect() | Gather into MultiGeometry |
| dissolve(by) | Merge geometries by attribute |
| clip(wkt) | Clip to geometry (intersection) |
| erase(wkt) | Erase geometry (difference) |
| symmetric_difference(wkt) | XOR with geometry |
| point_on_surface() | Get point on surface |
| reverse() | Reverse vertex order |
| flip_coordinates() | Swap X/Y coordinates |
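Transformations can presumably be chained like the other builder methods. The sketch below assumes that each transformation returns a new Query, and that `"meters"` is an accepted value for `unit`. Neither is confirmed by this reference. The function name `centroid_buffers` is hypothetical.

```python
def centroid_buffers(query, distance=100, unit="meters"):
    """Repair geometries, reduce them to centroids, then buffer each centroid."""
    return (
        query
        .make_valid()            # repair invalid geometries first
        .centroid()              # one point per feature
        .buffer(distance, unit)  # the unit string is an assumption
    )


# circles = centroid_buffers(ds.query(), distance=250)
```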
Computed Columns
| Method | Description |
|---|---|
| with_area(col_name) | Add area column |
| with_length(col_name) | Add length/perimeter column |
| with_perimeter(col_name) | Add perimeter column |
| with_bounds(prefix) | Add minx, miny, maxx, maxy columns |
| with_distance_to(wkt, col_name) | Add distance column |
| with_x(col_name) | Add X coordinate column |
| with_y(col_name) | Add Y coordinate column |
| with_coordinates(x_col, y_col) | Add X and Y columns |
| with_geometry_type(col_name) | Add geometry type column |
| with_num_points(col_name) | Add vertex count column |
| with_is_valid(col_name) | Add validity check column |
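A minimal sketch of adding several computed columns in one chain. The output column names are arbitrary choices for illustration, and the example assumes each `with_*` method returns a Query that can be chained further. The units of the resulting values are not stated in this reference.

```python
def add_measurements(query):
    """Append area, perimeter, geometry type and validity columns to a query."""
    return (
        query
        .with_area("area")                # column names here are illustrative
        .with_perimeter("perimeter")
        .with_geometry_type("geom_type")
        .with_is_valid("is_valid")
    )


# df = add_measurements(ds.query()).to_pandas()
```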
Spatial Joins
| Method | Description |
|---|---|
| sjoin(other, predicate, how) | Spatial join with predicates: intersects, within, contains, touches, crosses, overlaps |
| nearest(other, k, max_distance) | K-nearest neighbor join (conflicting columns get _right suffix) |
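The two join methods might be used as sketched below. The keyword names come from the signatures in the table. The value `"inner"` for `how`, the distance of 500 and the variable names are assumptions, and this reference does not say which unit `max_distance` uses.

```python
def join_examples(points, polygons, stations):
    """Return two joined queries built from existing Query-like objects."""
    # Keep points that fall inside a polygon; "inner" is an assumed value for `how`.
    inside = points.sjoin(polygons, predicate="within", how="inner")
    # Attach the single closest station within an assumed search distance.
    closest = points.nearest(stations, k=1, max_distance=500)
    return inside, closest
```

According to the table, columns that clash in a nearest-neighbor join receive a `_right` suffix, so downstream code should expect names such as `name_right` when both inputs share a `name` column.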
Analytics
| Method | Description |
|---|---|
| aggregate(agg) | GROUP BY with aggregations |
| count() | Count rows |
| head(n) | Get first n rows |
| sample(n, seed) | Get random sample |
| describe() | Get statistics |
| explain() | Get query plan |
ROI (Region of Interest)
geofabric.roi.bbox(minx, miny, maxx, maxy, srid=4326)
Create a bounding box ROI.
geofabric.roi.wkt(wkt_text, srid=4326)
Create a WKT geometry ROI.
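The two constructors describe the same kind of region in different ways: a bounding box is equivalent to a rectangular WKT polygon. The helper below is hypothetical and uses no GeoFabric code. It makes that equivalence concrete by converting the four bbox values into WKT text of the kind `geofabric.roi.wkt` accepts.

```python
def bbox_to_wkt(minx, miny, maxx, maxy):
    """Return the WKT polygon that covers a bounding box."""
    if minx > maxx or miny > maxy:
        raise ValueError("min values must not exceed max values")
    # Closed ring, counter-clockwise, starting at the lower-left corner.
    ring = [(minx, miny), (maxx, miny), (maxx, maxy), (minx, maxy), (minx, miny)]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


print(bbox_to_wkt(0, 0, 2, 1))  # POLYGON((0 0, 2 0, 2 1, 0 1, 0 0))
```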
Configuration
Configure credentials and settings programmatically. Values set this way take precedence over environment variables.
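As stated above, programmatic configuration takes precedence over environment variables. The sketch below shows that lookup order in plain Python. It is not GeoFabric's implementation, the helper `resolve_setting` is hypothetical, and `AWS_REGION` is used only as an illustrative variable name, since this reference does not list which environment variables are read.

```python
import os


def resolve_setting(programmatic, env_var, env=os.environ, default=None):
    """Pick a setting: programmatic value first, then environment, then default."""
    if programmatic is not None:
        return programmatic
    return env.get(env_var, default)


fake_env = {"AWS_REGION": "eu-west-1"}  # stand-in for the real environment
print(resolve_setting("us-east-1", "AWS_REGION", env=fake_env))  # programmatic value wins
print(resolve_setting(None, "AWS_REGION", env=fake_env))         # falls back to the environment
```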
Cloud Storage
| Function | Parameters | Description |
|---|---|---|
| configure_s3() | access_key_id, secret_access_key, region, session_token, endpoint, use_ssl | AWS S3 credentials |
| configure_gcs() | access_key_id, secret_access_key, project | Google Cloud Storage |
| configure_azure() | account_name, account_key, connection_string, sas_token | Azure Blob Storage |
Databases
| Function | Parameters | Description |
|---|---|---|
| configure_postgis() | host, port, database, user, password, sslmode | PostgreSQL/PostGIS defaults |
APIs & HTTP
| Function | Parameters | Description |
|---|---|---|
| configure_stac() | api_key, headers, default_catalog | STAC catalog authentication |
| configure_http() | proxy, timeout, headers, verify_ssl | Global HTTP settings |
Utility
| Function | Description |
|---|---|
| reset_config() | Clear all programmatic configuration |
| get_config() | Get current configuration object |
Example:
import geofabric as gf
gf.configure_s3(
    access_key_id="AKIA...",
    secret_access_key="...",
    region="us-east-1"
)
ds = gf.open("s3://my-bucket/data.parquet?anonymous=false")
See Authentication Guide for detailed usage.
Validation
geofabric.validate_geometries(engine, sql, geometry_col, id_col)
Validate geometries and return ValidationResult.
geofabric.compute_stats(engine, sql, geometry_col)
Compute dataset statistics and return DatasetStats.
Data Classes
ValidationResult
- total_rows: Total row count
- valid_count: Valid geometry count
- invalid_count: Invalid geometry count
- null_count: NULL geometry count
- issues: List of ValidationIssue
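A small sketch of reading the ValidationResult counters listed above. The function `summarize_validation` is hypothetical, and a `SimpleNamespace` stands in for a real result so the snippet runs without GeoFabric. The numbers are made up for illustration.

```python
from types import SimpleNamespace


def summarize_validation(result):
    """Format the counters of a ValidationResult-like object as one line."""
    if result.total_rows == 0:
        share = 0.0
    else:
        share = result.invalid_count / result.total_rows
    return (
        f"{result.invalid_count}/{result.total_rows} invalid ({share:.1%}), "
        f"{result.null_count} NULL"
    )


# Stand-in object with the documented field names and invented values.
example = SimpleNamespace(
    total_rows=200, valid_count=190, invalid_count=6, null_count=4, issues=[]
)
print(summarize_validation(example))  # 6/200 invalid (3.0%), 4 NULL
```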
DatasetStats
- row_count: Total rows
- column_count: Number of columns
- columns: List of column names
- dtypes: Dictionary of column types
- bounds: Geometry bounds (minx, miny, maxx, maxy)
- geometry_type: Geometry type
- crs: Coordinate reference system
- null_counts: NULL counts per column
CLI Commands
| Command | Description |
|---|---|
| gf sql URI QUERY | Run SQL query |
| gf pull URI OUT | Extract data subset |
| gf info URI | Show dataset info |
| gf stats URI | Show statistics |
| gf validate URI | Validate geometries |
| gf head URI | Show first rows |
| gf sample URI OUT | Random sample |
| gf buffer URI OUT | Buffer geometries |
| gf simplify URI OUT | Simplify geometries |
| gf transform URI OUT | Transform CRS |
| gf centroid URI OUT | Compute centroids |
| gf convex-hull URI OUT | Compute convex hulls |
| gf dissolve URI OUT | Dissolve geometries |
| gf add-area URI OUT | Add area column |
| gf add-length URI OUT | Add length column |
| gf overture download | Download Overture data |