: Error during execution: Azure Data Factory Oracle connector 2.0 → Parquet write failure (ArrayIndexOutOfBounds: Index 126 out of bounds for length 38)

Maria Gabriele Alves Ramos 0 Reputation points
2025-08-21T19:20:47.48+00:00

Hello — I'm hitting a blocking error when running an ADF pipeline that reads from Oracle (using Oracle connector 2.0) and writes to Parquet. I suspect it’s related to decimal/float precision mapping, but I can't find a definitive fix. Any guidance appreciated.

ErrorCode=ParquetJavaInvocationException,
Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,
Message=An error occurred when invoking java, message: java.lang.ArrayIndexOutOfBoundsException:
Index 126 out of bounds for length 38
total entry:1
com.microsoft.datatransfer.bridge.parquet.ParquetWriterBuilderBridge.addDecimalColumn(ParquetWriterBuilderBridge.java:107)

Context:

  • Azure Data Factory Copy Activity (Oracle connector v2.0) → sink Parquet (ADLS).
  • Source DB: Oracle (TASY schema, many functions used in the SELECT).
  • Some source numeric columns are Oracle FLOAT / NUMBER and appear to map to a very large precision (example: FLOAT(126)).
  • The Java Parquet writer fails when trying to create a decimal column with precision > 38.
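For context, Oracle's FLOAT(b) precision is binary, not decimal; per Oracle's documentation the decimal equivalent is CEIL(b * 0.30103), so FLOAT(126) — the default and maximum — corresponds to about 38 decimal digits. The error message ("Index 126 out of bounds for length 38") is consistent with the raw binary precision 126 being passed where a decimal precision (max 38 for Parquet DECIMAL) is expected, though that is my reading of the stack trace, not a confirmed diagnosis. A quick check in Oracle:

```sql
-- Oracle FLOAT(b) stores b binary digits of precision. Its decimal
-- equivalent is CEIL(b * 0.30103); FLOAT(126) ~ 38 decimal digits,
-- which is exactly the Parquet DECIMAL precision limit.
SELECT CEIL(126 * 0.30103) AS decimal_digits FROM dual;  -- 38
```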

What I already tried:

  • In the SELECT, forced numeric columns with CAST(... AS BINARY_DOUBLE) to produce a double instead of a decimal.
  • As an alternative, converted numbers to strings using REPLACE(TO_CHAR(...), ',', '.') to avoid decimal types in Parquet.
  • Ensured the Copy Activity column mapping shows Double (or String) for those columns.
  • Tested the SELECT directly in Oracle (works).
  • Temporarily wrote results to CSV (works) — so the problem appears to be the Parquet/decimal mapping.
  • Verified permissions and removed @item() dynamic references (running a static query).
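For reference, the two cast workarounds above look roughly like this (the column and table names are placeholders, not my actual schema):

```sql
-- Workaround 1: force a double instead of a decimal
SELECT CAST(nr_valor AS BINARY_DOUBLE) AS nr_valor
FROM   tasy.my_table;

-- Workaround 2: emit the value as a string, converting the
-- decimal comma produced by TO_CHAR under my NLS settings to a dot
SELECT REPLACE(TO_CHAR(nr_valor), ',', '.') AS nr_valor
FROM   tasy.my_table;
```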

Despite these changes, I still hit the ArrayIndexOutOfBoundsException when writing Parquet in some runs (sometimes after other small adjustments).

Questions / requests:

  1. Is this a known bug in the ADF Oracle connector v2.0 (or the Parquet writer) when it encounters Oracle FLOAT with very large precision (e.g. FLOAT(126))?
  2. What is the recommended way to force a safe schema for Parquet output from ADF when the source may have FLOAT/NUMBER with undefined precision? (e.g. best practice: CAST to NUMBER(p,s), BINARY_DOUBLE, or string?)
  3. Are there specific Copy Activity / sink settings (Parquet format options, decimal conversion, or advanced mapping) that prevent the Parquet writer from trying to create DECIMAL with precision > 38?
  4. Are there driver/connector versions, JVM flags, or hotfixes I should apply? (I'm currently using Oracle connector v2.0.)
  5. If I must keep numeric types (not strings) for downstream consumers, what is the safest approach to preserve numeric semantics while avoiding this error? (e.g. cast to NUMBER(38,10) in SQL, use a Data Flow, or transform with an Azure Databricks job before writing Parquet?)
  6. What additional logs or diagnostic info would be helpful to share (pipeline JSON, full stack trace, sample row)?

What I can share if helpful:

  • Full error stack trace from the ADF run.
  • The SELECT query used (with CASTs / REPLACE).
  • Pipeline + Copy Activity JSON.
  • A small sample of problematic values (if needed).

Thanks in advance — any pointers, known issues, or concrete config examples will help me move forward.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. Maria Gabriele Alves Ramos 0 Reputation points
    2025-08-26T16:45:07.9433333+00:00

    I encountered the same issue when using the Oracle connector in Azure Data Factory. To resolve it, I used the CAST function with the NUMBER type, which is standard in Oracle. However, it's important to explicitly define the precision and scale when casting, like CAST(column AS NUMBER(15,2)).

    If you use CAST(column AS NUMBER) without specifying precision, the error may still occur, especially when writing to formats like Parquet. Defining the precision ensures that ADF can correctly map the data types during transformation and sink operations.
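    A minimal sketch of that cast (the column and table names below are placeholders, not from the original query):

    ```sql
    -- Explicit precision and scale: maps cleanly to Parquet DECIMAL(15,2)
    SELECT CAST(vl_total AS NUMBER(15,2)) AS vl_total
    FROM   tasy.my_table;

    -- Avoid: NUMBER without (p,s) leaves precision undefined, and the
    -- Parquet writer may still fail when inferring the decimal type
    -- SELECT CAST(vl_total AS NUMBER) AS vl_total FROM tasy.my_table;
    ```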

    Let me know if you need an example or further clarification!

