
Commit bfb29c9
Releasing 2.12.0

1 parent fe56f84

5 files changed, +23 -23 lines changed

CHANGELOG.md (+1 -1)

@@ -3,7 +3,7 @@ All notable changes to this project will be documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

-## [UNRELEASED] - YYYY-MM-DD
+## [2.12.0] - 2024-04-26

 ## Fixes


README.md (+10 -10)

@@ -198,7 +198,7 @@ The package version has the following semantics: `spark-extension_{SCALA_COMPAT_VERSION}`
 Add this line to your `build.sbt` file:

 ```sbt
-libraryDependencies += "uk.co.gresearch.spark" %% "spark-extension" % "2.11.0-3.5"
+libraryDependencies += "uk.co.gresearch.spark" %% "spark-extension" % "2.12.0-3.5"
 ```

 ### Maven

@@ -209,7 +209,7 @@ Add this dependency to your `pom.xml` file:
 <dependency>
   <groupId>uk.co.gresearch.spark</groupId>
   <artifactId>spark-extension_2.12</artifactId>
-  <version>2.11.0-3.5</version>
+  <version>2.12.0-3.5</version>
 </dependency>
 ```

@@ -219,7 +219,7 @@ Add this dependency to your `build.gradle` file:

 ```groovy
 dependencies {
-    implementation "uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.5"
+    implementation "uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.5"
 }
 ```

@@ -228,7 +228,7 @@ dependencies {
 Submit your Spark app with the Spark Extension dependency (version ≥1.1.0) as follows:

 ```shell script
-spark-submit --packages uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.5 [jar]
+spark-submit --packages uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.5 [jar]
 ```

 Note: Pick the right Scala version (here 2.12) and Spark version (here 3.5) depending on your Spark version.

@@ -238,7 +238,7 @@ Note: Pick the right Scala version (here 2.12) and Spark version (here 3.5) depe
 Launch a Spark Shell with the Spark Extension dependency (version ≥1.1.0) as follows:

 ```shell script
-spark-shell --packages uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.5
+spark-shell --packages uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.5
 ```

 Note: Pick the right Scala version (here 2.12) and Spark version (here 3.5) depending on your Spark Shell version.

@@ -254,7 +254,7 @@ from pyspark.sql import SparkSession

 spark = SparkSession \
     .builder \
-    .config("spark.jars.packages", "uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.5") \
+    .config("spark.jars.packages", "uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.5") \
     .getOrCreate()
 ```

@@ -265,7 +265,7 @@ Note: Pick the right Scala version (here 2.12) and Spark version (here 3.5) depe
 Launch the Python Spark REPL with the Spark Extension dependency (version ≥1.1.0) as follows:

 ```shell script
-pyspark --packages uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.5
+pyspark --packages uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.5
 ```

 Note: Pick the right Scala version (here 2.12) and Spark version (here 3.5) depending on your PySpark version.

@@ -275,7 +275,7 @@ Note: Pick the right Scala version (here 2.12) and Spark version (here 3.5) depe
 Run your Python scripts that use PySpark via `spark-submit`:

 ```shell script
-spark-submit --packages uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.5 [script.py]
+spark-submit --packages uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.5 [script.py]
 ```

 Note: Pick the right Scala version (here 2.12) and Spark version (here 3.5) depending on your Spark version.

@@ -289,7 +289,7 @@ Running your Python application on a Spark cluster will still require one of the
 to add the Scala package to the Spark environment.

 ```shell script
-pip install pyspark-extension==2.11.0.3.5
+pip install pyspark-extension==2.12.0.3.5
 ```

 Note: Pick the right Spark version (here 3.5) depending on your PySpark version.

@@ -299,7 +299,7 @@ Note: Pick the right Spark version (here 3.5) depending on your PySpark version.
 There are plenty of [Data Science notebooks](https://datasciencenotebook.org/) around. To use this library,
 add **a jar dependency** to your notebook using these **Maven coordinates**:

-    uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.5
+    uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.5

 Or [download the jar](https://mvnrepository.com/artifact/uk.co.gresearch.spark/spark-extension) and place it
 on a filesystem where it is accessible by the notebook, and reference that jar file directly.
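The repeated "pick the right version" notes in this README are the whole compatibility story. As an illustrative sketch (not part of this commit), the matching Spark compat version can be derived from an installed PySpark:

```python
# Illustrative only: derive the Spark compat version from the installed
# PySpark and print matching spark-extension Maven coordinates.
# The Scala suffix (_2.12 vs _2.13) must still match your Spark build's Scala.
import pyspark

spark_compat = ".".join(pyspark.__version__.split(".")[:2])  # e.g. "3.5.1" -> "3.5"
print(f"uk.co.gresearch.spark:spark-extension_2.12:2.12.0-{spark_compat}")
```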

pom.xml (+1 -1)

@@ -2,7 +2,7 @@
   <modelVersion>4.0.0</modelVersion>
   <groupId>uk.co.gresearch.spark</groupId>
   <artifactId>spark-extension_2.13</artifactId>
-  <version>2.12.0-3.5-SNAPSHOT</version>
+  <version>2.12.0-3.5</version>
   <name>Spark Extension</name>
   <description>A library that provides useful extensions to Apache Spark.</description>
   <inceptionYear>2020</inceptionYear>

python/README.md (+10 -10)

@@ -2,20 +2,20 @@

 This project provides extensions to the [Apache Spark project](https://spark.apache.org/) in Scala and Python:

-**[Diff](https://github.com/G-Research/spark-extension/blob/v2.11.0/DIFF.md):** A `diff` transformation and application for `Dataset`s that computes the differences between
+**[Diff](https://github.com/G-Research/spark-extension/blob/v2.12.0/DIFF.md):** A `diff` transformation and application for `Dataset`s that computes the differences between
 two datasets, i.e. which rows to _add_, _delete_ or _change_ to get from one dataset to the other.

-**[Histogram](https://github.com/G-Research/spark-extension/blob/v2.11.0/HISTOGRAM.md):** A `histogram` transformation that computes the histogram DataFrame for a value column.
+**[Histogram](https://github.com/G-Research/spark-extension/blob/v2.12.0/HISTOGRAM.md):** A `histogram` transformation that computes the histogram DataFrame for a value column.

-**[Global Row Number](https://github.com/G-Research/spark-extension/blob/v2.11.0/ROW_NUMBER.md):** A `withRowNumbers` transformation that provides the global row number w.r.t.
+**[Global Row Number](https://github.com/G-Research/spark-extension/blob/v2.12.0/ROW_NUMBER.md):** A `withRowNumbers` transformation that provides the global row number w.r.t.
 the current order of the Dataset, or any given order. In contrast to the existing SQL function `row_number`, which
 requires a window spec, this transformation provides the row number across the entire Dataset without scaling problems.

-**[Inspect Parquet files](https://github.com/G-Research/spark-extension/blob/v2.11.0/PARQUET.md):** The structure of Parquet files (the metadata, not the data stored in Parquet) can be inspected similar to [parquet-tools](https://pypi.org/project/parquet-tools/)
+**[Inspect Parquet files](https://github.com/G-Research/spark-extension/blob/v2.12.0/PARQUET.md):** The structure of Parquet files (the metadata, not the data stored in Parquet) can be inspected similar to [parquet-tools](https://pypi.org/project/parquet-tools/)
 or [parquet-cli](https://pypi.org/project/parquet-cli/) by reading from a simple Spark data source.
 This simplifies identifying why some Parquet files cannot be split by Spark into scalable partitions.

-**[Install Python packages into PySpark job](https://github.com/G-Research/spark-extension/blob/v2.11.0/PYSPARK-DEPS.md):** Install Python dependencies via PIP or Poetry programmatically into your running PySpark job (PySpark ≥ 3.1.0):
+**[Install Python packages into PySpark job](https://github.com/G-Research/spark-extension/blob/v2.12.0/PYSPARK-DEPS.md):** Install Python dependencies via PIP or Poetry programmatically into your running PySpark job (PySpark ≥ 3.1.0):

 ```python
 # noinspection PyUnresolvedReferences
@@ -94,7 +94,7 @@ Running your Python application on a Spark cluster will still require one of the
 to add the Scala package to the Spark environment.

 ```shell script
-pip install pyspark-extension==2.11.0.3.4
+pip install pyspark-extension==2.12.0.3.4
 ```

 Note: Pick the right Spark version (here 3.4) depending on your PySpark version.

@@ -108,7 +108,7 @@ from pyspark.sql import SparkSession

 spark = SparkSession \
     .builder \
-    .config("spark.jars.packages", "uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.4") \
+    .config("spark.jars.packages", "uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.4") \
     .getOrCreate()
 ```

@@ -119,7 +119,7 @@ Note: Pick the right Scala version (here 2.12) and Spark version (here 3.4) depe
 Launch the Python Spark REPL with the Spark Extension dependency (version ≥1.1.0) as follows:

 ```shell script
-pyspark --packages uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.4
+pyspark --packages uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.4
 ```

 Note: Pick the right Scala version (here 2.12) and Spark version (here 3.4) depending on your PySpark version.

@@ -129,7 +129,7 @@ Note: Pick the right Scala version (here 2.12) and Spark version (here 3.4) depe
 Run your Python scripts that use PySpark via `spark-submit`:

 ```shell script
-spark-submit --packages uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.4 [script.py]
+spark-submit --packages uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.4 [script.py]
 ```

 Note: Pick the right Scala version (here 2.12) and Spark version (here 3.4) depending on your Spark version.

@@ -139,7 +139,7 @@ Note: Pick the right Scala version (here 2.12) and Spark version (here 3.4) depe
 There are plenty of [Data Science notebooks](https://datasciencenotebook.org/) around. To use this library,
 add **a jar dependency** to your notebook using these **Maven coordinates**:

-    uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.4
+    uk.co.gresearch.spark:spark-extension_2.12:2.12.0-3.4

 Or [download the jar](https://mvnrepository.com/artifact/uk.co.gresearch.spark/spark-extension) and place it
 on a filesystem where it is accessible by the notebook, and reference that jar file directly.
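For orientation, the Diff feature referenced in this README takes only a few lines to exercise. A minimal sketch, assuming the `gresearch.spark.diff` module documented in the project's DIFF.md and an installed `pyspark-extension` package:

```python
# Minimal sketch of the diff transformation described in DIFF.md; assumes the
# pyspark-extension package is installed and its jar is on the Spark classpath.
from pyspark.sql import SparkSession
from gresearch.spark.diff import *  # adds .diff(...) to DataFrame

spark = SparkSession.builder.getOrCreate()

left = spark.createDataFrame([(1, "one"), (2, "two"), (3, "three")], ["id", "value"])
right = spark.createDataFrame([(1, "one"), (2, "Two"), (4, "four")], ["id", "value"])

# Rows are matched on "id"; the diff column marks rows as N (no change),
# C (changed), D (deleted) or I (inserted).
left.diff(right, "id").show()
```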

python/setup.py (+1 -1)

@@ -17,7 +17,7 @@
 from pathlib import Path
 from setuptools import setup

-jar_version = '2.12.0-3.5-SNAPSHOT'
+jar_version = '2.12.0-3.5'
 scala_version = '2.13.8'
 scala_compat_version = '.'.join(scala_version.split('.')[:2])
 spark_compat_version = jar_version.split('-')[1]
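For reference, the version strings in this file decompose as follows. A standalone sketch of the split logic shown above; the dotted PyPI form `2.12.0.3.5` is taken from the README diff in this commit:

```python
# Standalone illustration of the version-string handling in setup.py.
jar_version = '2.12.0-3.5'
scala_version = '2.13.8'

scala_compat_version = '.'.join(scala_version.split('.')[:2])  # '2.13'
spark_compat_version = jar_version.split('-')[1]               # '3.5'

# The pip package version joins both parts with a dot (see README above):
pypi_version = jar_version.replace('-', '.')                   # '2.12.0.3.5'
print(scala_compat_version, spark_compat_version, pypi_version)
```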
