This repository was archived by the owner on Jul 28, 2025. It is now read-only.

Commit e1f2272

archive
1 parent 777f6dc commit e1f2272

1 file changed: +1 -146 lines changed

README.md

Lines changed: 1 addition & 146 deletions
@@ -1,148 +1,3 @@
 ## 1. Shared library plugins for Polars
 
-<a href="https://crates.io/crates/pyo3-polars">
-<img src="https://img.shields.io/crates/v/pyo3-polars.svg"/>
-</a>
-
-Documentation for this functionality may also be found in the [Polars User Guide](https://docs.pola.rs/user-guide/plugins/).
-This is new functionality and should be preferred over `2.` as this
-will circumvent the GIL and will be the way we want to support extending polars.
-
-Parallelism and optimizations are managed by the default polars runtime. That runtime will call into the plugin function.
-The plugin functions are compiled separately.
-
-We can therefore keep polars more lean and maybe add support for a `polars-distance`, `polars-geo`, `polars-ml`, etc.
-Those can then have specialized expressions and don't have to worry as much for code bloat as they can be optionally installed.
-
-The idea is that you define an expression in another Rust crate with a proc_macro `polars_expr`.
-
-The macro may have one of the following attributes:
-
-- `output_type` -> to define the output type of that expression
-- `output_type_func` -> to define a function that computes the output type based on input types.
-- `output_type_func_with_kwargs` -> to define a function that computes the output type based on input types and keyword args.
-
-Here is an example of a `String` conversion expression that converts any string to [pig latin](https://en.wikipedia.org/wiki/Pig_Latin):
-
-```rust
-fn pig_latin_str(value: &str, capitalize: bool, output: &mut String) {
-    if let Some(first_char) = value.chars().next() {
-        if capitalize {
-            for c in value.chars().skip(1).map(|char| char.to_uppercase()) {
-                write!(output, "{c}").unwrap()
-            }
-            write!(output, "AY").unwrap()
-        } else {
-            let offset = first_char.len_utf8();
-            write!(output, "{}{}ay", &value[offset..], first_char).unwrap()
-        }
-    }
-}
-
-#[derive(Deserialize)]
-struct PigLatinKwargs {
-    capitalize: bool,
-}
-
-#[polars_expr(output_type=String)]
-fn pig_latinnify(inputs: &[Series], kwargs: PigLatinKwargs) -> PolarsResult<Series> {
-    let ca = inputs[0].str()?;
-    let out: StringChunked =
-        ca.apply_into_string_amortized(|value, output| pig_latin_str(value, kwargs.capitalize, output));
-    Ok(out.into_series())
-}
-```
-
-This can then be exposed on the Python side:
-
-```python
-from __future__ import annotations
-
-from typing import TYPE_CHECKING
-
-import polars as pl
-from polars.plugins import register_plugin_function
-
-from expression_lib._utils import LIB
-
-if TYPE_CHECKING:
-    from expression_lib._typing import IntoExprColumn
-
-
-def pig_latinnify(expr: IntoExprColumn, capitalize: bool = False) -> pl.Expr:
-    return register_plugin_function(
-        plugin_path=LIB,
-        args=[expr],
-        function_name="pig_latinnify",
-        is_elementwise=True,
-        kwargs={"capitalize": capitalize},
-    )
-```
-
-Compile/ship and then it is ready to use:
-
-```python
-import polars as pl
-from expression_lib import language
-
-df = pl.DataFrame({
-    "names": ["Richard", "Alice", "Bob"],
-})
-
-
-out = df.with_columns(
-    pig_latin = language.pig_latinnify("names")
-)
-```
-
-Alternatively, you can [register a custom namespace](https://docs.pola.rs/py-polars/html/reference/api/polars.api.register_expr_namespace.html#polars.api.register_expr_namespace), which enables you to write:
-
-```python
-out = df.with_columns(
-    pig_latin = pl.col("names").language.pig_latinnify()
-)
-```
-
-See the full example in [example/derive_expression]: https://github.com/pola-rs/pyo3-polars/tree/main/example/derive_expression
-
-## 2. Pyo3 extensions for Polars
-
-See the `example` directory for a concrete example. Here we send a polars `DataFrame` to rust and then compute a
-`jaccard similarity` in parallel using `rayon` and rust hash sets.
-
-## Run example
-
-`$ cd example && make install`
-`$ venv/bin/python run.py`
-
-This will output:
-
-```
-shape: (2, 2)
-┌───────────┬───────────────┐
-│ list_a    ┆ list_b        │
-│ ---       ┆ ---           │
-│ list[i64] ┆ list[i64]     │
-╞═══════════╪═══════════════╡
-│ [1, 2, 3] ┆ [1, 2, ... 8] │
-│ [5, 5]    ┆ [5, 1, 1]     │
-└───────────┴───────────────┘
-shape: (2, 1)
-┌─────────┐
-│ jaccard │
-│ ---     │
-│ f64     │
-╞═════════╡
-│ 0.75    │
-│ 0.5     │
-└─────────┘
-```
-
-## Compile for release
-
-`$ make install-release`
-
-# What to expect
-
-This crate offers a `PySeries` and a `PyDataFrame` which are simple wrapper around `Series` and `DataFrame`. The
-advantage of these wrappers is that they can be converted to and from python as they implement `FromPyObject` and `IntoPy`.
+This project has been vendored in the main Polars repo: [https://github.com/pola-rs/polars/tree/main/pyo3-polars](https://github.com/pola-rs/polars/tree/main/pyo3-polars)
