Applies to:
SQL Server
Azure SQL Database
Azure SQL Managed Instance
SQL analytics endpoint in Microsoft Fabric
Warehouse in Microsoft Fabric
The OPENROWSET function reads data from one or more files and returns the content as a rowset. Depending on the service, the file might be stored in Azure Blob Storage, Azure Data Lake Storage, on-premises disk, or network shares. You can read various file formats such as text/CSV, Parquet, or JSON Lines.
The OPENROWSET function can be referenced in the FROM clause of a query as if it were a table name. It can be used to read data in a SELECT statement, or to update target data in UPDATE, INSERT, DELETE, MERGE, CTAS, or CETAS statements.
OPENROWSET(BULK) is designed for reading data from external data files. OPENROWSET without BULK is designed for reading from another database engine. For more information, see OPENROWSET (Transact-SQL).
This article and the argument set of OPENROWSET(BULK) vary between platforms.
- For Microsoft Fabric syntax, select Fabric in the version dropdown list.
- For SQL Server, Azure SQL Database, and Azure SQL Managed Instance syntax, select your platform in the version dropdown list.
Details and links to similar examples on other platforms:
- For more information on OPENROWSET in Azure SQL Database, see Data virtualization with Azure SQL Database.
- For more information on OPENROWSET in Azure SQL Managed Instance, see Data virtualization with Azure SQL Managed Instance.
- For information and examples with serverless SQL pools in Azure Synapse, see How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics.
- Dedicated SQL pools in Azure Synapse don't support the OPENROWSET function.
Transact-SQL syntax conventions
Syntax for SQL Server, Azure SQL Database, and Azure SQL Managed Instance
OPENROWSET( BULK 'data_file_path',
<bulk_option> ( , <bulk_option> )*
)
[
WITH ( ( <column_name> <sql_datatype> [ '<column_path>' | <column_ordinal> ] )+ )
]
<bulk_option> ::=
DATA_SOURCE = 'data_source_name' |
-- file format options
CODEPAGE = { 'ACP' | 'OEM' | 'RAW' | 'code_page' } |
DATAFILETYPE = { 'char' | 'widechar' } |
FORMAT = <file_format> |
FORMATFILE = 'format_file_path' |
FORMATFILE_DATA_SOURCE = 'data_source_name' |
SINGLE_BLOB |
SINGLE_CLOB |
SINGLE_NCLOB |
-- Text/CSV options
ROWTERMINATOR = 'row_terminator' |
FIELDTERMINATOR = 'field_terminator' |
FIELDQUOTE = 'quote_character' |
-- Error handling options
MAXERRORS = maximum_errors |
ERRORFILE = 'file_name' |
ERRORFILE_DATA_SOURCE = 'data_source_name' |
-- Execution options
FIRSTROW = first_row |
LASTROW = last_row |
       ORDER ( { column [ ASC | DESC ] } [ , ...n ] ) [ UNIQUE ] |
ROWS_PER_BATCH = rows_per_batch
Syntax for Fabric Data Warehouse
OPENROWSET( BULK 'data_file_path',
<bulk_option> ( , <bulk_option> )*
)
[
WITH ( ( <column_name> <sql_datatype> [ '<column_path>' | <column_ordinal> ] )+ )
]
<bulk_option> ::=
-- file format options
CODEPAGE = { 'ACP' | 'OEM' | 'RAW' | 'code_page' } |
DATAFILETYPE = { 'char' | 'widechar' } |
FORMAT = <file_format> |
-- Text/CSV options
ROWTERMINATOR = 'row_terminator' |
FIELDTERMINATOR = 'field_terminator' |
FIELDQUOTE = 'quote_character' |
ESCAPECHAR = 'escape_char' |
HEADER_ROW = [true|false] |
PARSER_VERSION = 'parser_version' |
-- Error handling options
MAXERRORS = maximum_errors |
ERRORFILE = 'file_name' |
-- Execution options
FIRSTROW = first_row |
LASTROW = last_row |
ROWS_PER_BATCH = rows_per_batch
Arguments
The arguments of the BULK
option allow for significant control over where to start and end reading data, how to deal with errors, and how data is interpreted. For example, you can specify that the data file is read as a single-row, single-column rowset of type varbinary, varchar, or nvarchar. The default behavior is described in the argument descriptions that follow.
For information about how to use the BULK
option, see the Remarks section later in this article. For information about the permissions that the BULK
option requires, see the Permissions section, later in this article.
For information on preparing data for bulk import, see Prepare data for bulk export or import.
BULK 'data_file_path'
The path or URI of the data file(s) whose data is to be read and returned as a rowset. The URI can reference Azure Data Lake Storage or Azure Blob Storage.
The supported path formats are:
- {drive letter}:\{file path} to access files on a local disk
- \\{network-share}\{file path} to access files on network shares
- adls://{container}@{storage}.dfs.core.windows.net/{file path} to access Azure Data Lake Storage
- abs://{storage}.blob.core.windows.net/{container}/{file path} to access Azure Blob Storage
- s3://{ip-address}:{port}/{file path} to access S3-compatible storage
Note
This article and the supported URI patterns differ on different platforms. For the URI patterns that are available in Microsoft Fabric Data Warehouse, select Fabric in the version dropdown list.
Beginning with SQL Server 2017 (14.x), the data_file can be in Azure Blob Storage. For examples, see Examples of bulk access to data in Azure Blob Storage.
- https://{storage}.blob.core.windows.net/{container}/{file path} to access Azure Blob Storage or Azure Data Lake Storage
- https://{storage}.dfs.core.windows.net/{container}/{file path} to access Azure Data Lake Storage
- abfss://{container}@{storage}.dfs.core.windows.net/{file path} to access Azure Data Lake Storage
- https://onelake.dfs.fabric.microsoft.com/{workspaceId}/{lakehouseId}/Files/{file path} to access Fabric OneLake (currently in preview)
Note
This article and the supported URI patterns differ on different platforms. For the URI patterns that are available in SQL Server, Azure SQL Database, and Azure SQL Managed Instance, select the product in the version dropdown list.
The URI can include the *
character to match any sequence of characters, allowing OPENROWSET
to pattern-match against the URI. Additionally, it can end with /**
to enable recursive traversal through all subfolders. In SQL Server, this behavior is available beginning with SQL Server 2022 (16.x).
For example:
SELECT TOP 10 *
FROM OPENROWSET(
BULK '{scheme:}//pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/*.parquet'
);
The storage types that can be referenced by the URI are shown in the following table:
Version | On-premises | Azure storage | Fabric OneLake | S3 | Google Cloud (GCS) |
---|---|---|---|---|---|
SQL Server 2017 (14.x), SQL Server 2019 (15.x) | Yes | Yes | No | No | No |
SQL Server 2022 (16.x) | Yes | Yes | No | Yes | No |
Azure SQL Database | No | Yes | No | No | No |
Azure SQL Managed Instance | No | Yes | No | No | No |
Serverless SQL pool in Azure Synapse Analytics | No | Yes | Yes | No | No |
Microsoft Fabric Warehouse and SQL analytics endpoint | No | Yes | Yes (preview) | Yes (preview), using Fabric OneLake shortcuts | Yes (preview), using Fabric OneLake shortcuts |
You can use OPENROWSET(BULK)
to read data directly from files stored in the Fabric OneLake, specifically from the Files folder of a Fabric Lakehouse. This eliminates the need for external staging accounts (such as ADLS Gen2 or Blob Storage) and enables workspace-governed, SaaS-native ingestion using Fabric permissions. This functionality supports:
- Reading from Files folders in Lakehouses
- Workspace-to-warehouse loads within the same tenant
- Native identity enforcement using Microsoft Entra ID
Note
Accessing Fabric OneLake storage is in preview. See the limitations that are applicable both to COPY INTO
and OPENROWSET(BULK)
.
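As a sketch, a query over Lakehouse files might look like the following. The workspace and lakehouse identifiers and the sales folder are hypothetical placeholders; OneLake access is in preview.

```sql
-- Hypothetical workspace/lakehouse GUIDs and folder; OneLake access is in preview.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://onelake.dfs.fabric.microsoft.com/{workspaceId}/{lakehouseId}/Files/sales/*.parquet'
);
```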
DATA_SOURCE
DATA_SOURCE defines the root location of the data file path and enables you to use relative paths in the BULK path. The data source is created with CREATE EXTERNAL DATA SOURCE. In addition to the root location, it can define a custom credential that can be used to access the files in that location.
For example:
CREATE EXTERNAL DATA SOURCE root
WITH (LOCATION = '{scheme:}//pandemicdatalake.blob.core.windows.net/public')
GO
SELECT *
FROM OPENROWSET(
BULK '/curated/covid-19/bing_covid-19_data/latest/*.parquet',
DATA_SOURCE = 'root'
);
File format options
CODEPAGE
Specifies the code page of the data in the data file. CODEPAGE is relevant only if the data contains char, varchar, or text columns with character values greater than 127 or less than 32. The valid values are 'ACP', 'OEM', 'RAW', or 'code_page':
CODEPAGE value | Description |
---|---|
ACP |
Converts columns of char, varchar, or text data type from the ANSI/Microsoft Windows code page (ISO 1252) to the SQL Server code page. |
OEM (default) |
Converts columns of char, varchar, or text data type from the system OEM code page to the SQL Server code page. |
RAW |
No conversion occurs from one code page to another. This is the fastest option. |
code_page |
Indicates the source code page on which the character data in the data file is encoded; for example, 850. |
Important
Versions before SQL Server 2016 (13.x) don't support code page 65001 (UTF-8 encoding).
CODEPAGE
isn't a supported option on Linux.
Note
We recommend that you specify a collation name for each column in a format file, except when you want the 65001 option to have priority over the collation/code page specification.
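For example, a minimal sketch that reads a UTF-8 encoded CSV file by specifying code page 65001 (the path is a placeholder):

```sql
-- '{data-file-path}' is a placeholder; code page 65001 = UTF-8 (SQL Server 2016 and later).
SELECT *
FROM OPENROWSET(
    BULK '{data-file-path}',
    FORMAT = 'CSV',
    CODEPAGE = '65001'
) AS c;
```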
DATAFILETYPE
Specifies whether OPENROWSET(BULK) should read single-byte (ASCII, UTF-8) or multi-byte (UTF-16) file content. The valid values are char and widechar:
DATAFILETYPE value | All data represented in: |
---|---|
char (default) | Character format. For more information, see Use Character Format to Import or Export Data. |
widechar | Unicode characters. For more information, see Use Unicode Character Format to Import or Export Data. |
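As a sketch, reading a UTF-16 encoded data file with a format file (both paths are placeholders):

```sql
-- Placeholders; assumes the data file is UTF-16 (widechar) encoded.
SELECT *
FROM OPENROWSET(
    BULK '{data-file-path}',
    FORMATFILE = '{format-file-path}',
    DATAFILETYPE = 'widechar'
) AS w;
```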
FORMAT
Specifies the format of the referenced file, for example:
SELECT *
FROM OPENROWSET(BULK N'{data-file-path}',
FORMAT='CSV') AS cars;
The valid values are 'CSV' (comma separated values file compliant to the RFC 4180 standard), 'PARQUET', 'DELTA' (version 1.0), and 'JSONL', depending on version:
Version | CSV | PARQUET | DELTA | JSONL |
---|---|---|---|---|
SQL Server 2017 (14.x), SQL Server 2019 (15.x) | Yes | No | No | No |
SQL Server 2022 (16.x) and later versions | Yes | Yes | Yes | No |
Azure SQL Database | Yes | Yes | Yes | No |
Azure SQL Managed Instance | Yes | Yes | Yes | No |
Serverless SQL pool in Azure Synapse Analytics | Yes | Yes | Yes | No |
Microsoft Fabric Warehouse and SQL analytics endpoint | Yes | Yes | No | Yes (preview) |
Important
The OPENROWSET function can read only newline-delimited JSON. This feature is currently in preview. The newline character must be used as a separator between JSON documents, and can't be placed in the middle of a JSON document.
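A minimal sketch of reading a JSON Lines file in Fabric, where each line of the file is a complete JSON document; the path is a placeholder:

```sql
-- '{data-file-path}' is a placeholder pointing to a newline-delimited JSON file.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK '{data-file-path}',
    FORMAT = 'JSONL'
);
```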
The FORMAT option doesn't need to be specified if the file extension in the path ends with .csv, .tsv, .parquet, .parq, .jsonl, .ldjson, or .ndjson. For example, the OPENROWSET(BULK) function infers that the format is Parquet from the extension in the following example:
SELECT *
FROM OPENROWSET(
BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet'
);
If the file path doesn't end with one of these extensions, you need to specify a FORMAT
, for example:
SELECT TOP 10 *
FROM OPENROWSET(
BULK 'abfss://nyctlc@azureopendatastorage.blob.core.windows.net/yellow/**',
FORMAT='PARQUET'
)
FORMATFILE
Specifies the full path of a format file. SQL Server supports two types of format files: XML and non-XML.
SELECT TOP 10 *
FROM OPENROWSET(
BULK 'D:\XChange\test-csv.csv',
FORMATFILE= 'D:\XChange\test-format-file.xml'
)
A format file is required to define column types in the result set. The only exception is when SINGLE_CLOB, SINGLE_BLOB, or SINGLE_NCLOB is specified; in that case, the format file isn't required.
For information about format files, see Use a format file to bulk import data (SQL Server).
Beginning with SQL Server 2017 (14.x), the format_file_path can be in Azure Blob Storage. For examples, see Examples of bulk access to data in Azure Blob Storage.
FORMATFILE_DATA_SOURCE
FORMATFILE_DATA_SOURCE defines the root location of the format file path. It enables you to use relative paths in the FORMATFILE option.
CREATE EXTERNAL DATA SOURCE root
WITH (LOCATION = '//pandemicdatalake/public/curated')
GO
SELECT *
FROM OPENROWSET(
BULK '//pandemicdatalake/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.csv',
FORMATFILE = 'covid-19/bing_covid-19_data/latest/bing_covid-19_data.fmt',
FORMATFILE_DATA_SOURCE = 'root'
);
The format file data source is created with CREATE EXTERNAL DATA SOURCE. In addition to the root location, it can define custom credential that can be used to access the files on that location.
Text/CSV options
ROWTERMINATOR
Specifies the row terminator to be used for char and widechar data files, for example:
SELECT *
FROM OPENROWSET(
BULK '{data-file-path}',
ROWTERMINATOR = '\n'
);
The default row terminator is \r\n (carriage return and line feed). For more information, see Specify field and row terminators.
FIELDTERMINATOR
Specifies the field terminator to be used for char and widechar data files, for example:
SELECT *
FROM OPENROWSET(
BULK '{data-file-path}',
FIELDTERMINATOR = '\t'
);
The default field terminator is , (comma). For more information, see Specify field and row terminators.
FIELDQUOTE = 'field_quote'
Beginning with SQL Server 2017 (14.x), this argument specifies a character that is used as the quote character in the CSV file, like in the following New York example:
Empire State Building,40.748817,-73.985428,"20 W 34th St, New York, NY 10118","\icons\sol.png"
Statue of Liberty,40.689247,-74.044502,"Liberty Island, New York, NY 10004","\icons\sol.png"
Only a single character can be specified as the value for this option. If not specified, the quote character ("
) is used as the quote character as defined in the RFC 4180 standard. The FIELDTERMINATOR
character (for example, a comma) can be placed within the field quotes and it will be considered as a regular character in the cell wrapped with the FIELDQUOTE
characters.
For example, in order to read the previous New York sample CSV dataset, use FIELDQUOTE = '"'
. The address field's values will be retained as a single value, not split into multiple values by the commas within the "
(quote) characters.
SELECT *
FROM OPENROWSET(
BULK '{data-file-path}',
FIELDQUOTE = '"'
);
PARSER_VERSION = 'parser_version'
Applies to: Fabric Data Warehouse
Specifies parser version to be used when reading files. Currently supported CSV
parser versions are 1.0 and 2.0:
- PARSER_VERSION = '1.0'
- PARSER_VERSION = '2.0'
SELECT TOP 10 *
FROM OPENROWSET(
BULK 'abfss://nyctlc@azureopendatastorage.blob.core.windows.net/yellow/**',
FORMAT='CSV',
PARSER_VERSION = '2.0'
)
CSV parser version 1.0 is the default and is feature-rich. Version 2.0 is built for performance and doesn't support all options and encodings.
CSV parser version 1.0 specifics:
- The following options aren't supported: HEADER_ROW.
- Default terminators are \r\n, \n, and \r.
- If you specify \n (newline) as the row terminator, it's automatically prefixed with a \r (carriage return) character, which results in a row terminator of \r\n.
CSV parser version 2.0 specifics:
- Not all data types are supported.
- Maximum character column length is 8,000.
- Maximum row size limit is 8 MB.
- The following options aren't supported: DATA_COMPRESSION.
- A quoted empty string ("") is interpreted as an empty string.
- The DATEFORMAT SET option isn't honored.
- Supported format for the date data type: YYYY-MM-DD
- Supported format for the time data type: HH:MM:SS[.fractional seconds]
- Supported format for the datetime2 data type: YYYY-MM-DD HH:MM:SS[.fractional seconds]
- Default terminators are \r\n and \n.
ESCAPECHAR = 'escape_char'
Specifies the character in the file that is used to escape itself and all delimiter values in the file, for example:
Place,Address,Icon
Empire State Building,20 W 34th St\, New York\, NY 10118,\\icons\\sol.png
Statue of Liberty,Liberty Island\, New York\, NY 10004,\\icons\\sol.png
If the escape character is followed by a value other than itself, or any of the delimiter values, the escape character is dropped when reading the value.
The ESCAPECHAR parameter is applied regardless of whether FIELDQUOTE is enabled. It isn't used to escape the quoting character; the quoting character must be escaped with another quoting character. The quoting character can appear within a column value only if the value is encapsulated with quoting characters.
In the following example, the comma (,) and backslash (\) are escaped and represented as \, and \\:
SELECT *
FROM OPENROWSET(
BULK '{data-file-path}',
ESCAPECHAR = '\'
);
HEADER_ROW = { TRUE | FALSE }
Specifies whether a CSV file contains a header row that shouldn't be returned with other data rows. The following CSV file contains a header row:
Place,Latitude,Longitude,Address,Area,State,Zipcode
Empire State Building,40.748817,-73.985428,20 W 34th St,New York,NY,10118
Statue of Liberty,40.689247,-74.044502,Liberty Island,New York,NY,10004
The default is FALSE. Supported in PARSER_VERSION='2.0'. If TRUE, the column names are read from the first row according to the FIRSTROW argument. If TRUE and a schema is specified using WITH, column names are bound by name rather than by ordinal position.
SELECT *
FROM OPENROWSET(
BULK '{data-file-path}',
HEADER_ROW = TRUE
);
Error handling options
ERRORFILE = 'file_name'
Specifies the file used to collect rows that have formatting errors and can't be converted to an OLE DB rowset. These rows are copied into this error file from the data file "as is."
SELECT *
FROM OPENROWSET(
BULK '{data-file-path}',
ERRORFILE = '{error-file-path}'
);
The error file is created at the start of the command execution. An error is raised if the file already exists. Additionally, a control file that has the extension .ERROR.txt is created. This file references each row in the error file and provides error diagnostics. After the errors are corrected, the data can be loaded.
Beginning with SQL Server 2017 (14.x), the error_file_path
can be in Azure Blob Storage.
ERRORFILE_DATA_SOURCE
Beginning with SQL Server 2017 (14.x), this argument is a named external data source pointing to the location of the error file that will contain errors found during the import.
CREATE EXTERNAL DATA SOURCE root
WITH (LOCATION = '{root-error-file-path}')
GO
SELECT *
FROM OPENROWSET(
BULK '{data-file-path}',
ERRORFILE = '{relative-error-file-path}',
ERRORFILE_DATA_SOURCE = 'root'
);
For more information, see CREATE EXTERNAL DATA SOURCE (Transact-SQL).
MAXERRORS = maximum_errors
Specifies the maximum number of syntax errors or nonconforming rows, as defined in the format file, which can occur before OPENROWSET
throws an exception. Until MAXERRORS
is reached, OPENROWSET
ignores each bad row, not loading it, and counts the bad row as one error.
SELECT *
FROM OPENROWSET(
BULK '{data-file-path}',
MAXERRORS = 0
);
The default for maximum_errors is 10.
Note
MAXERRORS doesn't apply to CHECK constraints, or to converting money and bigint data types.
Data processing options
FIRSTROW = first_row
Specifies the number of the first row to load. The default is 1. This indicates the first row in the specified data file. The row numbers are determined by counting the row terminators. FIRSTROW
is 1-based.
LASTROW = last_row
Specifies the number of the last row to load. The default is 0, which indicates the last row in the specified data file.
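For example, a sketch that skips a header line and stops after row 1000 (the path is a placeholder):

```sql
-- Reads rows 2 through 1000 of the file; '{data-file-path}' is a placeholder.
SELECT *
FROM OPENROWSET(
    BULK '{data-file-path}',
    FORMAT = 'CSV',
    FIRSTROW = 2,
    LASTROW = 1000
) AS subset;
```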
ROWS_PER_BATCH = rows_per_batch
Specifies the approximate number of rows of data in the data file. This value is an estimate, and should be an approximation (within one order of magnitude) of the actual number of rows. By default, ROWS_PER_BATCH
is estimated based on file characteristics (number of files, file sizes, size of the returned data types). Specifying ROWS_PER_BATCH = 0
is the same as omitting ROWS_PER_BATCH
. For example:
SELECT TOP 10 *
FROM OPENROWSET(
BULK '{data-file-path}',
ROWS_PER_BATCH = 100000
);
ORDER ( { column [ ASC | DESC ] } [ ,...n ] ) [ UNIQUE ]
An optional hint that specifies how the data in the data file is sorted. By default, the bulk operation assumes the data file is unordered. Performance can improve if the query optimizer can exploit the order to generate a more efficient query plan. The following list provides examples for when specifying a sort can be beneficial:
- Inserting rows into a table that has a clustered index, where the rowset data is sorted on the clustered index key.
- Joining the rowset with another table, where the sort and join columns match.
- Aggregating the rowset data by the sort columns.
- Using the rowset as a source table in the
FROM
clause of a query, where the sort and join columns match.
UNIQUE
Specifies that the data file doesn't have duplicate entries.
If the actual rows in the data file aren't sorted according to the order that is specified, or if the UNIQUE hint is specified and duplicate keys are present, an error is returned.
Column aliases are required when ORDER
is used. The column alias list must reference the derived table that is being accessed by the BULK
clause. The column names that are specified in the ORDER
clause refer to this column alias list. Large value types (varchar(max), nvarchar(max), varbinary(max), and xml) and large object (LOB) types (text, ntext, and image) columns can't be specified.
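A sketch under the assumptions that the data file is pre-sorted by its first column and that a format file exists; all paths, the target table, and the column names are hypothetical. Note the required column alias list, which the ORDER clause refers to:

```sql
-- Hypothetical paths and table; assumes the file is sorted by id with no duplicate keys.
INSERT INTO dbo.TargetTable (id, name)
SELECT a.id, a.name
FROM OPENROWSET(
    BULK '{data-file-path}',
    FORMATFILE = '{format-file-path}',
    ORDER (id ASC) UNIQUE
) AS a (id, name);
```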
Content options
SINGLE_BLOB
Returns the contents of data_file as a single-row, single-column rowset of type varbinary(max).
Important
We recommend that you import XML data only using the SINGLE_BLOB
option, rather than SINGLE_CLOB
and SINGLE_NCLOB
, because only SINGLE_BLOB
supports all Windows encoding conversions.
SINGLE_CLOB
By reading data_file as ASCII, returns the contents as a single-row, single-column rowset of type varchar(max), using the collation of the current database.
SINGLE_NCLOB
By reading data_file as Unicode, returns the contents as a single-row, single-column rowset of type nvarchar(max), using the collation of the current database.
SELECT * FROM OPENROWSET(
BULK N'C:\Text1.txt',
SINGLE_NCLOB
) AS Document;
WITH Schema
The WITH
schema specifies the columns that define the result set of the OPENROWSET
function. It includes column definitions for every column that will be returned as a result and outlines the mapping rules that bind the underlying file columns to the columns in the result set.
In the following example:
- The country_region column has a varchar(50) type and references the underlying column with the same name.
- The date column references a CSV/Parquet column or JSONL property with a different physical name.
- The cases column references the third column in the file.
- The fatal_cases column references a nested Parquet property or JSONL sub-object.
SELECT *
FROM OPENROWSET(<...>)
WITH (
country_region varchar(50), --> country_region column has varchar(50) type and referencing the underlying column with the same name
[date] DATE '$.updated', --> date is referencing a CSV/Parquet column or JSONL property with a different physical name
cases INT 3, --> cases is referencing third column in the file
fatal_cases INT '$.statistics.deaths' --> fatal_cases is referencing a nested Parquet property or JSONL sub-object
);
<column_name>
The name of the column that will be returned in the result rowset. The data for this column is read from the underlying file column with the same name, unless overridden by <column_path>
or <column_ordinal>
. The name of the column must follow the rules for column name identifiers.
<sql_datatype>
The T-SQL type of the column in the result set. The values from the underlying file are converted to this type when OPENROWSET
returns the results. For more information, see Data types in Fabric Warehouse.
<column_path>
A dot-separated path (for example $.description.location.lat
) used to reference nested fields in complex types like Parquet.
<column_ordinal>
A number representing the physical index of the column that will be mapped to the column in the WITH
clause.
Permissions
OPENROWSET with external data sources requires the following permissions:

- ADMINISTER DATABASE BULK OPERATIONS or ADMINISTER BULK OPERATIONS

The following T-SQL example grants ADMINISTER DATABASE BULK OPERATIONS to a principal.
GRANT ADMINISTER DATABASE BULK OPERATIONS TO [<principal_name>];
If the target storage account is private, the principal must also have the Storage Blob Data Reader role (or higher) assigned at the container or storage account level.
Remarks
- A FROM clause that is used with SELECT can call OPENROWSET(BULK...) instead of a table name, with full SELECT functionality.
- OPENROWSET with the BULK option requires a correlation name, also known as a range variable or alias, in the FROM clause. Failure to add AS <table_alias> results in the error Msg 491: "A correlation name must be specified for the bulk rowset in the from clause."
- Column aliases can be specified. If a column alias list isn't specified, the format file must have column names. Specifying column aliases overrides the column names in the format file, such as:
  - FROM OPENROWSET(BULK...) AS table_alias
  - FROM OPENROWSET(BULK...) AS table_alias(column_alias,...n)
- A SELECT...FROM OPENROWSET(BULK...) statement queries the data in a file directly, without importing the data into a table.
- A SELECT...FROM OPENROWSET(BULK...) statement can list bulk-column aliases by using a format file to specify column names, and also data types.
- Using OPENROWSET(BULK...) as a source table in an INSERT or MERGE statement bulk imports data from a data file into a table. For more information, see Use BULK INSERT or OPENROWSET(BULK...) to import data to SQL Server.
- When the OPENROWSET BULK option is used with an INSERT statement, the BULK clause supports table hints. In addition to the regular table hints, such as TABLOCK, the BULK clause can accept the following specialized table hints: IGNORE_CONSTRAINTS (ignores only the CHECK and FOREIGN KEY constraints), IGNORE_TRIGGERS, KEEPDEFAULTS, and KEEPIDENTITY. For more information, see Table Hints (Transact-SQL).
- For information about how to use INSERT...SELECT * FROM OPENROWSET(BULK...) statements, see Bulk Import and Export of Data (SQL Server). For information about when row-insert operations that are performed by bulk import are logged in the transaction log, see Prerequisites for minimal logging in bulk import.
- When used to import data with the full recovery model, OPENROWSET (BULK ...) doesn't optimize logging.
Note
When you use OPENROWSET
, it's important to understand how SQL Server handles impersonation. For information about security considerations, see Use BULK INSERT or OPENROWSET(BULK...) to import data to SQL Server.
In Microsoft Fabric, the supported features are summarized in the table:
Feature | Supported | Not available |
---|---|---|
File formats | Parquet, CSV, JSONL (preview) | Delta, Azure Cosmos DB, JSON, relational databases |
Authentication | EntraID/SPN passthrough, public storage | SAS/SAK, SPN, Managed access |
Storage | Azure Blob Storage, Azure Data Lake Storage, Fabric OneLake (preview) | |
Options | Only full/absolute URI in OPENROWSET |
Relative URI path in OPENROWSET , DATA_SOURCE |
Partitioning | You can use the filepath() function in a query. |
Bulk importing SQLCHAR, SQLNCHAR, or SQLBINARY data
OPENROWSET(BULK...)
assumes that, if not specified, the maximum length of SQLCHAR
, SQLNCHAR
, or SQLBINARY
data doesn't exceed 8,000 bytes. If the data being imported is in a LOB data field that contains any varchar(max), nvarchar(max), or varbinary(max) objects that exceed 8,000 bytes, you must use an XML format file that defines the maximum length for the data field. To specify the maximum length, edit the format file and declare the MAX_LENGTH attribute.
Note
An automatically generated format file doesn't specify the length or maximum length for a LOB field. However, you can edit a format file and specify the length or maximum length manually.
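As a sketch, a hand-edited XML format file for a two-field file whose second field holds LOB data might declare MAX_LENGTH like this; the field layout, terminators, and column names are hypothetical:

```xml
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <RECORD>
    <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="100"/>
    <!-- MAX_LENGTH raised for the LOB field; auto-generated format files omit it. -->
    <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="2147483647"/>
  </RECORD>
  <ROW>
    <COLUMN SOURCE="1" NAME="Title" xsi:type="SQLNVARCHAR"/>
    <COLUMN SOURCE="2" NAME="Body" xsi:type="SQLVARYCHAR"/>
  </ROW>
</BCPFORMAT>
```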
Bulk exporting or importing SQLXML documents
To bulk export or import SQLXML data, use one of the following data types in your format file.
Data type | Effect |
---|---|
SQLCHAR or SQLVARYCHAR |
The data is sent in the client code page, or in the code page implied by the collation. |
SQLNCHAR or SQLNVARCHAR |
The data is sent as Unicode. |
SQLBINARY or SQLVARYBIN |
The data is sent without any conversion. |
Examples
This section provides general examples to demonstrate how to use OPENROWSET BULK
syntax.
A. Use OPENROWSET to BULK INSERT file data into a varbinary(max) column
Applies to: SQL Server only.
The following example creates a small table for demonstration purposes, and inserts file data from a file named Text1.txt
located in the C:
root directory into a varbinary(max) column.
CREATE TABLE myTable (
FileName NVARCHAR(60),
FileType NVARCHAR(60),
Document VARBINARY(MAX)
);
GO
INSERT INTO myTable (
FileName,
FileType,
Document
)
SELECT 'Text1.txt' AS FileName,
'.txt' AS FileType,
*
FROM OPENROWSET(
BULK N'C:\Text1.txt',
SINGLE_BLOB
) AS Document;
GO
B. Use the OPENROWSET BULK provider with a format file to retrieve rows from a text file
Applies to: SQL Server only.
The following example uses a format file to retrieve rows from a tab-delimited text file, values.txt
that contains the following data:
1 Data Item 1
2 Data Item 2
3 Data Item 3
The format file, values.fmt
, describes the columns in values.txt
:
9.0
2
1 SQLCHAR 0 10 "\t" 1 ID SQL_Latin1_General_Cp437_BIN
2 SQLCHAR 0 40 "\r\n" 2 Description SQL_Latin1_General_Cp437_BIN
This query retrieves that data:
SELECT a.* FROM OPENROWSET(
BULK 'C:\test\values.txt',
FORMATFILE = 'C:\test\values.fmt'
) AS a;
C. Specify a format file and code page
Applies to: SQL Server only.
The following example shows how to use both the format file and code page options at the same time.
INSERT INTO MyTable
SELECT a.* FROM OPENROWSET (
BULK N'D:\data.csv',
FORMATFILE = 'D:\format_no_collation.txt',
CODEPAGE = '65001'
) AS a;
D. Access data from a CSV file with a format file
Applies to: SQL Server 2017 (14.x) and later versions only.
SELECT * FROM OPENROWSET(
BULK N'D:\XChange\test-csv.csv',
FORMATFILE = N'D:\XChange\test-csv.fmt',
FIRSTROW = 2,
FORMAT = 'CSV'
) AS cars;
E. Access data from a CSV file without a format file
Applies to: SQL Server only.
SELECT * FROM OPENROWSET(
BULK 'C:\Program Files\Microsoft SQL Server\MSSQL14.CTP1_1\MSSQL\DATA\inv-2017-01-19.csv',
SINGLE_CLOB
) AS DATA;
SELECT *
FROM OPENROWSET('MSDASQL',
'Driver={Microsoft Access Text Driver (*.txt, *.csv)}',
'SELECT * FROM E:\Tlog\TerritoryData.csv'
);
Important
The ODBC driver should be 64-bit. Open the Drivers tab of the Connect to an ODBC Data Source (SQL Server Import and Export Wizard) application in Windows to verify this. There's a 32-bit Microsoft Text Driver (*.txt, *.csv) that won't work with a 64-bit version of sqlservr.exe.
F. Access data from a file stored on Azure Blob Storage
Applies to: SQL Server 2017 (14.x) and later versions only.
In SQL Server 2017 (14.x) and later versions, the following example uses an external data source that points to a container in an Azure storage account and a database scoped credential created for a shared access signature.
SELECT * FROM OPENROWSET(
BULK 'inv-2017-01-19.csv',
DATA_SOURCE = 'MyAzureInvoices',
SINGLE_CLOB
) AS DataFile;
For complete OPENROWSET
examples including configuring the credential and external data source, see Examples of bulk access to data in Azure Blob Storage.
G. Import into a table from a file stored on Azure Blob Storage
The following example shows how to use the OPENROWSET
command to load data from a csv file in an Azure Blob storage location on which you created the SAS key. The Azure Blob storage location is configured as an external data source. This requires a database scoped credential using a shared access signature that is encrypted using a master key in the user database.
-- Optional: a MASTER KEY is not required if a DATABASE SCOPED CREDENTIAL is not required because the blob is configured for public (anonymous) access!
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
GO
-- Optional: a DATABASE SCOPED CREDENTIAL is not required because the blob is configured for public (anonymous) access!
CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '******srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z***************';
-- Make sure that you don't have a leading ? in the SAS token, and that you
-- have at least read permission on the object that should be loaded srt=o&sp=r,
-- and that expiration period is valid (all dates are in UTC time)
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH (
TYPE = BLOB_STORAGE,
LOCATION = 'https://****************.blob.core.windows.net/curriculum',
-- CREDENTIAL is not required if a blob is configured for public (anonymous) access!
CREDENTIAL = MyAzureBlobStorageCredential
);
INSERT INTO achievements
WITH (TABLOCK) (
id,
description
)
SELECT * FROM OPENROWSET(
BULK 'csv/achievements.csv',
DATA_SOURCE = 'MyAzureBlobStorage',
FORMAT = 'CSV',
FORMATFILE = 'csv/achievements-c.xml',
FORMATFILE_DATA_SOURCE = 'MyAzureBlobStorage'
) AS DataFile;
H. Use a managed identity for an external source
Applies to: Azure SQL Managed Instance and Azure SQL Database
The following example creates a credential by using a managed identity, creates an external data source, and then loads data from a CSV file hosted on that source.
First, create the credential and specify blob storage as the external source:
CREATE DATABASE SCOPED CREDENTIAL sampletestcred
WITH IDENTITY = 'MANAGED IDENTITY';
CREATE EXTERNAL DATA SOURCE SampleSource
WITH (
LOCATION = 'abs://****************.blob.core.windows.net/curriculum',
CREDENTIAL = sampletestcred
);
Next, load data from the CSV file hosted on blob storage:
SELECT * FROM OPENROWSET(
BULK 'Test - Copy.csv',
DATA_SOURCE = 'SampleSource',
SINGLE_CLOB
) as test;
I. Use OPENROWSET to access several Parquet files using S3-compatible object storage
Applies to: SQL Server 2022 (16.x) and later versions.
The following example accesses several Parquet files from different locations, all stored on S3-compatible object storage:
CREATE DATABASE SCOPED CREDENTIAL s3_dsc
WITH IDENTITY = 'S3 Access Key',
SECRET = 'contosoadmin:contosopwd';
GO
CREATE EXTERNAL DATA SOURCE s3_eds
WITH
(
LOCATION = 's3://10.199.40.235:9000/movies',
CREDENTIAL = s3_dsc
);
GO
SELECT * FROM OPENROWSET(
BULK (
'/decades/1950s/*.parquet',
'/decades/1960s/*.parquet',
'/decades/1970s/*.parquet'
),
FORMAT = 'PARQUET',
DATA_SOURCE = 's3_eds'
) AS data;
J. Use OPENROWSET to access several Delta tables from Azure Data Lake Gen2
Applies to: SQL Server 2022 (16.x) and later versions.
In this example, the data table container is named Contoso
and is located in an Azure Data Lake Gen2 storage account.
CREATE DATABASE SCOPED CREDENTIAL delta_storage_dsc
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS Token>';
CREATE EXTERNAL DATA SOURCE Delta_ED
WITH (
LOCATION = 'adls://<container>@<storage_account>.dfs.core.windows.net',
CREDENTIAL = delta_storage_dsc
);
SELECT *
FROM OPENROWSET(
BULK '/Contoso',
FORMAT = 'DELTA',
DATA_SOURCE = 'Delta_ED'
) AS result;
K. Use OPENROWSET to query a public, anonymous-access dataset
The following example uses the publicly available NYC yellow taxi trip records open data set.
Create the data source first:
CREATE EXTERNAL DATA SOURCE NYCTaxiExternalDataSource
WITH (LOCATION = 'abs://nyctlc@azureopendatastorage.blob.core.windows.net');
Query all files with the .parquet
extension in folders that match a name pattern:
SELECT TOP 10 *
FROM OPENROWSET(
BULK 'yellow/puYear=*/puMonth=*/*.parquet',
DATA_SOURCE = 'NYCTaxiExternalDataSource',
FORMAT = 'parquet'
) AS filerows;
A. Read a Parquet file from Azure Blob Storage
The following example shows how to read the first 100 rows from a Parquet file:
SELECT TOP 100 *
FROM OPENROWSET(
BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet'
);
B. Read a custom CSV file
The following example shows how to read rows from a CSV file that has a header row, with explicitly specified terminator characters that separate rows and fields:
SELECT *
FROM OPENROWSET(
BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.csv',
HEADER_ROW = TRUE,
ROW_TERMINATOR = '\n',
FIELD_TERMINATOR = ',');
C. Specify the file column schema while reading a file
The following example shows how to explicitly specify the schema of the rows that the OPENROWSET
function returns:
SELECT *
FROM OPENROWSET(
BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet')
WITH (
updated DATE
,confirmed INT
,deaths INT
,iso2 VARCHAR(8000)
,iso3 VARCHAR(8000)
);
D. Read partitioned data sets
The following example shows how to use the filepath()
function to read parts of the URI from the matched file path:
SELECT TOP 10
files.filepath(2) AS area
, files.*
FROM OPENROWSET(
BULK 'https://<storage account>.blob.core.windows.net/public/NYC_Property_Sales_Dataset/*_*.csv',
HEADER_ROW = TRUE)
AS files
WHERE files.filepath(1) = '2009';
E. Specify the file column schema while reading a JSONL file
The following example shows how to explicitly specify the schema of the rows that the OPENROWSET
function returns:
SELECT TOP 10 *
FROM OPENROWSET(
BULK 'https://pandemicdatalake.dfs.core.windows.net/public/curated/covid-19/bing_covid-19_data/latest/bing_covid-19_data.jsonl')
WITH (
country_region varchar(50),
date DATE '$.updated',
cases INT '$.confirmed',
fatal_cases INT '$.deaths'
);
If a column name doesn't match the physical name of a column in the properties of the JSONL file, you can specify the physical name in a JSON path after the type definition. The path can traverse multiple properties; for example, $.location.latitude
references a nested property in Parquet complex types or JSON sub-objects.
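As a sketch of that mapping, the following query binds columns to nested properties. The file URL and the property names ($.name, $.location.latitude, $.location.longitude) are hypothetical, for illustration only:

```sql
-- Hypothetical JSONL file containing objects with a nested "location" object;
-- the storage path and property names below are illustrative, not a real dataset.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<storage account>.dfs.core.windows.net/public/places.jsonl')
WITH (
    place_name VARCHAR(100) '$.name',
    latitude FLOAT '$.location.latitude',
    longitude FLOAT '$.location.longitude'
);
```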
More examples
For more examples that show using OPENROWSET(BULK...)
, see the following articles:
- Bulk Import and Export of Data (SQL Server)
- Examples of bulk import and export of XML documents (SQL Server)
- Keep identity values when bulk importing data (SQL Server)
- Keep nulls or default values during bulk import (SQL Server)
- Use a format file to bulk import data (SQL Server)
- Use character format to import or export data (SQL Server)
- Use a Format File to Skip a Table Column (SQL Server)
- Use a format file to skip a data field (SQL Server)
- Use a format file to map table columns to data-file fields (SQL Server)
- Query data sources using OPENROWSET in Azure SQL Managed Instances
- Specify field and row terminators (SQL Server)