{"diffoscope-json-version": 1, "source1": "/srv/reproducible-results/rbuild-debian/r-b-build.VVYSiNm4/b1/python-xarray_2025.01.2-1_i386.changes", "source2": "/srv/reproducible-results/rbuild-debian/r-b-build.VVYSiNm4/b2/python-xarray_2025.01.2-1_i386.changes", "unified_diff": null, "details": [{"source1": "Files", "source2": "Files", "unified_diff": "@@ -1,3 +1,3 @@\n \n- a8ccd70b6f7276b9c4bd3d1828502794 2009904 doc optional python-xarray-doc_2025.01.2-1_all.deb\n+ 2ff013168fb85277456a834d7bd1d87d 2010624 doc optional python-xarray-doc_2025.01.2-1_all.deb\n 4644c3352e568f782718f7d018211ae7 799852 python optional python3-xarray_2025.01.2-1_all.deb\n"}, {"source1": "python-xarray-doc_2025.01.2-1_all.deb", "source2": "python-xarray-doc_2025.01.2-1_all.deb", "unified_diff": null, "details": [{"source1": "file list", "source2": "file list", "unified_diff": "@@ -1,3 +1,3 @@\n -rw-r--r-- 0 0 0 4 2025-02-02 11:36:57.000000 debian-binary\n--rw-r--r-- 0 0 0 6168 2025-02-02 11:36:57.000000 control.tar.xz\n--rw-r--r-- 0 0 0 2003544 2025-02-02 11:36:57.000000 data.tar.xz\n+-rw-r--r-- 0 0 0 6172 2025-02-02 11:36:57.000000 control.tar.xz\n+-rw-r--r-- 0 0 0 2004260 2025-02-02 11:36:57.000000 data.tar.xz\n"}, {"source1": "control.tar.xz", "source2": "control.tar.xz", "unified_diff": null, "details": [{"source1": "control.tar", "source2": "control.tar", "unified_diff": null, "details": [{"source1": "./control", "source2": "./control", "unified_diff": "@@ -1,13 +1,13 @@\n Package: python-xarray-doc\n Source: python-xarray\n Version: 2025.01.2-1\n Architecture: all\n Maintainer: Debian Science Maintainers Visualizing your datasets is quick and convenient: Note the automatic labeling with names and units. Our effort in adding metadata attributes has paid off! Many aspects of these figures are customizable: see Plotting. Note This method replicates the behavior of Plotting\u00b6
\n In [37]: data.plot()\n-Out[37]: <matplotlib.collections.QuadMesh at 0xe83d1a40>\n+Out[37]: <matplotlib.collections.QuadMesh at 0xe80d0a40>\n
\n
pandas\u00b6
\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -253,15 +253,15 @@\n [0.37342613, 1.49497537, 1.33584385]])\n Coordinates:\n * x (x) int32 8B 10 20\n Dimensions without coordinates: y\n *\b**\b**\b**\b**\b* P\bPl\blo\bot\btt\bti\bin\bng\bg_\b?\b\u00b6 *\b**\b**\b**\b**\b*\n Visualizing your datasets is quick and convenient:\n In [37]: data.plot()\n-Out[37]: apply_ufunc
\", \"Compare weighted and unweighted mean temperature\", \"Blank template\", \"Calculating Seasonal Averages from Time Series of Monthly Means\", \"Working with Multidimensional Coordinates\", \"Visualization Gallery\", \"Toy weather data\", \"Gallery\", \"Frequently Asked Questions\", \"Getting Started\", \"Installation\", \"Quick overview\", \"Overview: Why xarray?\", \"Getting Help\", \"How do I \\u2026\", \"Xarray documentation\", \"Alternative chunked array types\", \"Integrating with duck arrays\", \"Extending xarray using accessors\", \"How to add a new backend\", \"How to create a custom index\", \"Xarray Internals\", \"Internal Design\", \"Interoperability of Xarray\", \"Zarr Encoding Specification\", \"Development roadmap\", \"Tutorials and Videos\", \"Combining data\", \"Computation\", \"Parallel Computing with Dask\", \"Data Structures\", \"Working with numpy-like arrays\", \"GroupBy: Group and Bin Data\", \"Hierarchical data\", \"User Guide\", \"Indexing and selecting data\", \"Interpolating data\", \"Reading and writing files\", \"Configuration\", \"Working with pandas\", \"Plotting\", \"Reshaping and reorganizing data\", \"Terminology\", \"Testing your code\", \"Time series data\", \"Weather and climate data\", \"What\\u2019s New\"],\n \"titleterms\": {\n \"\": [13, 54],\n \"0\": 54,\n \"01\": 54,\n \"02\": 54,\n"}]}, {"source1": "./usr/share/doc/python-xarray-doc/html/user-guide/computation.html", "source2": "./usr/share/doc/python-xarray-doc/html/user-guide/computation.html", "unified_diff": "@@ -934,16 +934,16 @@\n <xarray.Dataset> Size: 2kB\n Dimensions: (param: 10, cov_i: 10, cov_j: 10)\n Coordinates:\n * param (param) <U7 280B 'a0' 'xc0' ... 'xalpha1' 'yalpha1'\n * cov_i (cov_i) <U7 280B 'a0' 'xc0' ... 'xalpha1' 'yalpha1'\n * cov_j (cov_j) <U7 280B 'a0' 'xc0' ... 'xalpha1' 'yalpha1'\n Data variables:\n- curvefit_coefficients (param) float64 80B 3.0 1.004 1.003 ... 1.007 1.008\n- curvefit_covariance (cov_i, cov_j) float64 800B 3.362e-05 ... 
2.125e-05\n+ curvefit_coefficients (param) float64 80B 1.994 -0.9986 ... 1.999 0.9986\n+ curvefit_covariance (cov_i, cov_j) float64 800B 6.556e-05 ... 4.467e-06\n \n \n scipy.optimize.curve_fit()
.func(ds)
). This allows you to write pipelines for\n transforming your data (using \u201cmethod chaining\u201d) instead of writing hard to\n follow nested function calls:
# these lines are equivalent, but with pipe we can make the logic flow\n # entirely from left to right\n In [64]: plt.plot((2 * ds.temperature.sel(loc=0)).mean("instrument"))\n-Out[64]: [<matplotlib.lines.Line2D at 0xe42f7870>]\n+Out[64]: [<matplotlib.lines.Line2D at 0xe3ff3870>]\n \n In [65]: (ds.temperature.sel(loc=0).pipe(lambda x: 2 * x).mean("instrument").pipe(plt.plot))\n-Out[65]: [<matplotlib.lines.Line2D at 0xe42f7710>]\n+Out[65]: [<matplotlib.lines.Line2D at 0xe3ff3710>]\n
Both pipe
and assign
replicate the pandas methods of the same names\n (DataFrame.pipe
and\n DataFrame.assign
).
With xarray, there is no performance penalty for creating new datasets, even if\n variables are lazily loaded from a file on disk. Creating new objects instead\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -585,19 +585,19 @@\n There is also the pipe() method that allows you to use a method call with an\n external function (e.g., ds.pipe(func)) instead of simply calling it (e.g.,\n func(ds)). This allows you to write pipelines for transforming your data (using\n \u201cmethod chaining\u201d) instead of writing hard to follow nested function calls:\n # these lines are equivalent, but with pipe we can make the logic flow\n # entirely from left to right\n In [64]: plt.plot((2 * ds.temperature.sel(loc=0)).mean(\"instrument\"))\n-Out[64]: [ If you were a previous user of the prototype xarray-contrib/datatree package, this is different from what you\u2019re used to!\n In that package the data model was that the data stored in each node actually was completely unrelated. The data model is now slightly stricter.\n This allows us to provide features like Coordinate Inheritance. To demonstrate, let\u2019s first generate some example datasets which are not aligned with one another: Now we have a valid This is a useful way to organise our data because we can still operate on all the groups at once.\n For example we can extract all three timeseries at a specific lat-lon location: or compute the standard deviation of each timeseries to find out how it varies with sampling frequency: This helps to differentiate which variables are defined on the datatree node that you are currently looking at, and which were defined somewhere above it. 
We can also still perform all the same operations on the whole tree:# (drop the attributes just to make the printed representation shorter)\n In [89]: ds = xr.tutorial.open_dataset("air_temperature").drop_attrs()\n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n In [90]: ds_daily = ds.resample(time="D").mean("time")\n KeyError: "No variable named 'time'. Variables on the dataset include ['foo', 'x', 'letters']"\n \n \n In [91]: ds_weekly = ds.resample(time="W").mean("time")\n@@ -1054,15 +1054,15 @@\n \u2514\u2500\u2500 Group: /b/B\n
DataTree
structure which contains all the data at each different time frequency, stored in a separate group.In [100]: dt.sel(lat=75, lon=300)\n-ValueError: Dimensions {'lon', 'lat'} do not exist. Expected one or more of set()\n+ValueError: Dimensions {'lat', 'lon'} do not exist. Expected one or more of set()\n
In [101]: dt.std(dim="time")\n ValueError: Dimension(s) 'time' do not exist. Expected one or more of set()\n
In [107]: print(dt["/daily"])\n KeyError: 'Could not find node at /daily'\n
In [108]: dt.sel(lat=[75], lon=[300])\n-ValueError: Dimensions {'lon', 'lat'} do not exist. Expected one or more of set()\n+ValueError: Dimensions {'lat', 'lon'} do not exist. Expected one or more of set()\n \n \n In [109]: dt.std(dim="time")\n ValueError: Dimension(s) 'time' do not exist. Expected one or more of set()\n
In [52]: ds = xr.tutorial.open_dataset("air_temperature")\n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n # Define target latitude and longitude (where weather stations might be)\n In [53]: target_lon = xr.DataArray([200, 201, 202, 205], dims="points")\n \n In [54]: target_lat = xr.DataArray([31, 41, 42, 42], dims="points")\n \n@@ -697,15 +697,15 @@\n
To select and assign values to a portion of a DataArray()
you\n can use indexing with .loc
:
In [57]: ds = xr.tutorial.open_dataset("air_temperature")\n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n # add an empty 2D dataarray\n In [58]: ds["empty"] = xr.full_like(ds.air.mean("time"), fill_value=0)\n AttributeError: 'Dataset' object has no attribute 'air'\n \n \n@@ -869,15 +869,15 @@\n
You can also assign values to all variables of a Dataset
at once:
In [83]: ds_org = xr.tutorial.open_dataset("eraint_uvz").isel(\n ....: latitude=slice(56, 59), longitude=slice(255, 258), level=0\n ....: )\n ....: \n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n # set all values to 0\n In [84]: ds = xr.zeros_like(ds_org)\n NameError: name 'ds_org' is not defined\n \n \n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -474,15 +474,15 @@\n collection specified weather station latitudes and longitudes. To trigger\n vectorized indexing behavior you will need to provide the selection dimensions\n with a new shared output dimension name. In the example below, the selections\n of the closest latitude and longitude are renamed to an output dimension named\n \u201cpoints\u201d:\n In [52]: ds = xr.tutorial.open_dataset(\"air_temperature\")\n PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not\n-create data cache folder '/nonexistent/first-build/.cache/\n+create data cache folder '/nonexistent/second-build/.cache/\n xarray_tutorial_data'. Will not be able to download data files.\n \n \n # Define target latitude and longitude (where weather stations might be)\n In [53]: target_lon = xr.DataArray([200, 201, 202, 205], dims=\"points\")\n \n In [54]: target_lat = xr.DataArray([31, 41, 42, 42], dims=\"points\")\n@@ -513,15 +513,15 @@\n selected subpart of the target array (except for the explicitly indexed\n dimensions with .loc/.sel). 
Otherwise, IndexError will be raised.\n *\b**\b**\b**\b**\b* A\bAs\bss\bsi\big\bgn\bni\bin\bng\bg v\bva\bal\blu\bue\bes\bs w\bwi\bit\bth\bh i\bin\bnd\bde\bex\bxi\bin\bng\bg_\b?\b\u00b6 *\b**\b**\b**\b**\b*\n To select and assign values to a portion of a DataArray() you can use indexing\n with .loc :\n In [57]: ds = xr.tutorial.open_dataset(\"air_temperature\")\n PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not\n-create data cache folder '/nonexistent/first-build/.cache/\n+create data cache folder '/nonexistent/second-build/.cache/\n xarray_tutorial_data'. Will not be able to download data files.\n \n \n # add an empty 2D dataarray\n In [58]: ds[\"empty\"] = xr.full_like(ds.air.mean(\"time\"), fill_value=0)\n AttributeError: 'Dataset' object has no attribute 'air'\n \n@@ -673,15 +673,15 @@\n Dimensions without coordinates: x\n You can also assign values to all variables of a Dataset at once:\n In [83]: ds_org = xr.tutorial.open_dataset(\"eraint_uvz\").isel(\n ....: latitude=slice(56, 59), longitude=slice(255, 258), level=0\n ....: )\n ....:\n PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not\n-create data cache folder '/nonexistent/first-build/.cache/\n+create data cache folder '/nonexistent/second-build/.cache/\n xarray_tutorial_data'. 
Will not be able to download data files.\n \n \n # set all values to 0\n In [84]: ds = xr.zeros_like(ds_org)\n NameError: name 'ds_org' is not defined\n \n"}]}, {"source1": "./usr/share/doc/python-xarray-doc/html/user-guide/interpolation.html", "source2": "./usr/share/doc/python-xarray-doc/html/user-guide/interpolation.html", "unified_diff": "@@ -237,24 +237,24 @@\n ....: np.sin(np.linspace(0, 2 * np.pi, 10)),\n ....: dims="x",\n ....: coords={"x": np.linspace(0, 1, 10)},\n ....: )\n ....: \n \n In [17]: da.plot.line("o", label="original")\n-Out[17]: [<matplotlib.lines.Line2D at 0xe8349df0>]\n+Out[17]: [<matplotlib.lines.Line2D at 0xe8043df0>]\n \n In [18]: da.interp(x=np.linspace(0, 1, 100)).plot.line(label="linear (default)")\n-Out[18]: [<matplotlib.lines.Line2D at 0xe4b96df0>]\n+Out[18]: [<matplotlib.lines.Line2D at 0xe48a1df0>]\n \n In [19]: da.interp(x=np.linspace(0, 1, 100), method="cubic").plot.line(label="cubic")\n-Out[19]: [<matplotlib.lines.Line2D at 0xe2c9d870>]\n+Out[19]: [<matplotlib.lines.Line2D at 0xe213cc90>]\n \n In [20]: plt.legend()\n-Out[20]: <matplotlib.legend.Legend at 0xe83443c8>\n+Out[20]: <matplotlib.legend.Legend at 0xe803d3c8>\n
Additional keyword arguments can be passed to scipy\u2019s functions.
\n# fill 0 for the outside of the original coordinates.\n In [21]: da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={"fill_value": 0.0})\n@@ -439,15 +439,15 @@\n see Missing values.\n \n \n Example\u00b6
\n Let\u2019s see how interp()
works on real data.
\n # Raw data\n In [44]: ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)\n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n In [45]: fig, axes = plt.subplots(ncols=2, figsize=(10, 4))\n \n In [46]: ds.air.plot(ax=axes[0])\n AttributeError: 'Dataset' object has no attribute 'air'\n \n@@ -511,15 +511,15 @@\n ....: axes[0].plot(*xr.broadcast(lon.isel(z=idx), lat.isel(z=idx)), "--k")\n ....: \n \n In [61]: axes[0].set_title("Raw data")\n Out[61]: Text(0.5, 1.0, 'Raw data')\n \n In [62]: dsi = ds.interp(lon=lon, lat=lat)\n-ValueError: Dimensions {'lon', 'lat'} do not exist. Expected one or more of FrozenMappingWarningOnValuesAccess({'x': 3, 'y': 4})\n+ValueError: Dimensions {'lat', 'lon'} do not exist. 
Expected one or more of FrozenMappingWarningOnValuesAccess({'x': 3, 'y': 4})\n \n \n In [63]: dsi.air.plot(ax=axes[1])\n NameError: name 'dsi' is not defined\n \n \n In [64]: axes[1].set_title("Remapped data")\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -154,26 +154,26 @@\n ....: np.sin(np.linspace(0, 2 * np.pi, 10)),\n ....: dims=\"x\",\n ....: coords={\"x\": np.linspace(0, 1, 10)},\n ....: )\n ....:\n \n In [17]: da.plot.line(\"o\", label=\"original\")\n-Out[17]: []\n+Out[17]: []\n \n In [18]: da.interp(x=np.linspace(0, 1, 100)).plot.line(label=\"linear\n (default)\")\n-Out[18]: []\n+Out[18]: []\n \n In [19]: da.interp(x=np.linspace(0, 1, 100), method=\"cubic\").plot.line\n (label=\"cubic\")\n-Out[19]: []\n+Out[19]: []\n \n In [20]: plt.legend()\n-Out[20]: \n+Out[20]: \n _\b[_\b__\bb_\bu_\bi_\bl_\bd_\b/_\bh_\bt_\bm_\bl_\b/_\b__\bs_\bt_\ba_\bt_\bi_\bc_\b/_\bi_\bn_\bt_\be_\br_\bp_\bo_\bl_\ba_\bt_\bi_\bo_\bn_\b__\bs_\ba_\bm_\bp_\bl_\be_\b1_\b._\bp_\bn_\bg_\b]\n Additional keyword arguments can be passed to scipy\u2019s functions.\n # fill 0 for the outside of the original coordinates.\n In [21]: da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={\"fill_value\": 0.0})\n Out[21]:\n Size: 80B\n array([ 0. , 0. , 0. , 0.814, 0.604, -0.604, -0.814, 0. , 0. ,\n@@ -337,15 +337,15 @@\n * x (x) float64 24B 0.5 1.5 2.5\n For the details of interpolate_na(), see _\bM_\bi_\bs_\bs_\bi_\bn_\bg_\b _\bv_\ba_\bl_\bu_\be_\bs.\n *\b**\b**\b**\b**\b* E\bEx\bxa\bam\bmp\bpl\ble\be_\b?\b\u00b6 *\b**\b**\b**\b**\b*\n Let\u2019s see how interp() works on real data.\n # Raw data\n In [44]: ds = xr.tutorial.open_dataset(\"air_temperature\").isel(time=0)\n PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not\n-create data cache folder '/nonexistent/first-build/.cache/\n+create data cache folder '/nonexistent/second-build/.cache/\n xarray_tutorial_data'. 
Will not be able to download data files.\n \n \n In [45]: fig, axes = plt.subplots(ncols=2, figsize=(10, 4))\n \n In [46]: ds.air.plot(ax=axes[0])\n AttributeError: 'Dataset' object has no attribute 'air'\n@@ -410,15 +410,15 @@\n k\")\n ....:\n \n In [61]: axes[0].set_title(\"Raw data\")\n Out[61]: Text(0.5, 1.0, 'Raw data')\n \n In [62]: dsi = ds.interp(lon=lon, lat=lat)\n-ValueError: Dimensions {'lon', 'lat'} do not exist. Expected one or more of\n+ValueError: Dimensions {'lat', 'lon'} do not exist. Expected one or more of\n FrozenMappingWarningOnValuesAccess({'x': 3, 'y': 4})\n \n \n In [63]: dsi.air.plot(ax=axes[1])\n NameError: name 'dsi' is not defined\n \n \n"}]}, {"source1": "./usr/share/doc/python-xarray-doc/html/user-guide/io.html", "source2": "./usr/share/doc/python-xarray-doc/html/user-guide/io.html", "unified_diff": "@@ -630,15 +630,15 @@\n ....: "y": pd.date_range("2000-01-01", periods=5),\n ....: "z": ("x", list("abcd")),\n ....: },\n ....: )\n ....: \n \n In [13]: ds.to_zarr("path/to/directory.zarr")\n-Out[13]: <xarray.backends.zarr.ZarrStore at 0xe1baf8e0>\n+Out[13]: <xarray.backends.zarr.ZarrStore at 0xe1849850>\n
\n \n (The suffix .zarr
is optional\u2013just a reminder that a zarr store lives\n there.) If the directory does not exist, it will be created. If a zarr\n store is already present at that path, an error will be raised, preventing it\n from being overwritten. To override this behavior and overwrite an existing\n store, add mode='w'
when invoking to_zarr()
.
\n@@ -658,19 +658,19 @@\n To read back a zarr dataset that has been created this way, we use the\n open_zarr()
method:
\n In [14]: ds_zarr = xr.open_zarr("path/to/directory.zarr")\n \n In [15]: ds_zarr\n Out[15]: \n <xarray.Dataset> Size: 232B\n-Dimensions: (x: 4, y: 5)\n+Dimensions: (y: 5, x: 4)\n Coordinates:\n- z (x) object 16B dask.array<chunksize=(4,), meta=np.ndarray>\n * y (y) datetime64[ns] 40B 2000-01-01 2000-01-02 ... 2000-01-05\n * x (x) int32 16B 10 20 30 40\n+ z (x) object 16B dask.array<chunksize=(4,), meta=np.ndarray>\n Data variables:\n foo (x, y) float64 160B dask.array<chunksize=(4, 5), meta=np.ndarray>\n
\n \n \n Cloud Storage Buckets\u00b6
\n It is possible to read and write xarray datasets directly from / to cloud\n@@ -724,36 +724,36 @@\n \n In [18]: ds = xr.Dataset({"foo": ("x", dummies)}, coords={"x": np.arange(30)})\n \n In [19]: path = "path/to/directory.zarr"\n \n # Now we write the metadata without computing any array values\n In [20]: ds.to_zarr(path, compute=False)\n-Out[20]: Delayed('_finalize_store-869d28c3-3cb4-41c1-be67-7ddf964a11d9')\n+Out[20]: Delayed('_finalize_store-5a3251a1-b9f6-41d9-a554-1b45e9c096d5')\n
Now, a Zarr store with the correct variable shapes and attributes exists that\n can be filled out by subsequent calls to to_zarr
.\n Setting region="auto"
will open the existing store and determine the\n correct alignment of the new data with the existing dimensions, or as an\n explicit mapping from dimension names to Python slice
objects indicating\n where the data should be written (in index space, not label space), e.g.,
# For convenience, we'll slice a single dataset, but in the real use-case\n # we would create them separately possibly even from separate processes.\n In [21]: ds = xr.Dataset({"foo": ("x", np.arange(30))}, coords={"x": np.arange(30)})\n \n # Any of the following region specifications are valid\n In [22]: ds.isel(x=slice(0, 10)).to_zarr(path, region="auto")\n-Out[22]: <xarray.backends.zarr.ZarrStore at 0xe1e39388>\n+Out[22]: <xarray.backends.zarr.ZarrStore at 0xe1b0f3d0>\n \n In [23]: ds.isel(x=slice(10, 20)).to_zarr(path, region={"x": "auto"})\n-Out[23]: <xarray.backends.zarr.ZarrStore at 0xe1e39c40>\n+Out[23]: <xarray.backends.zarr.ZarrStore at 0xe1b0fcd0>\n \n In [24]: ds.isel(x=slice(20, 30)).to_zarr(path, region={"x": slice(20, 30)})\n-Out[24]: <xarray.backends.zarr.ZarrStore at 0xe239f3d0>\n+Out[24]: <xarray.backends.zarr.ZarrStore at 0xe20762b0>\n
Concurrent writes with region
are safe as long as they modify distinct\n chunks in the underlying Zarr arrays (or use an appropriate lock
).
As a safety check to make it harder to inadvertently override existing values,\n if you set region
then all variables included in a Dataset must have\n dimensions included in region
. Other variables (typically coordinates)\n@@ -816,28 +816,28 @@\n ....: "y": [1, 2, 3, 4, 5],\n ....: "t": pd.date_range("2001-01-01", periods=2),\n ....: },\n ....: )\n ....: \n \n In [30]: ds1.to_zarr("path/to/directory.zarr")\n-Out[30]: <xarray.backends.zarr.ZarrStore at 0xe1e45f10>\n+Out[30]: <xarray.backends.zarr.ZarrStore at 0xe1b31070>\n \n In [31]: ds2 = xr.Dataset(\n ....: {"foo": (("x", "y", "t"), np.random.rand(4, 5, 2))},\n ....: coords={\n ....: "x": [10, 20, 30, 40],\n ....: "y": [1, 2, 3, 4, 5],\n ....: "t": pd.date_range("2001-01-03", periods=2),\n ....: },\n ....: )\n ....: \n \n In [32]: ds2.to_zarr("path/to/directory.zarr", append_dim="t")\n-Out[32]: <xarray.backends.zarr.ZarrStore at 0xe1e454f0>\n+Out[32]: <xarray.backends.zarr.ZarrStore at 0xe1b0a898>\n
Chunk sizes may be specified in one of three ways when writing to a zarr store:
\nFor example, let\u2019s say we\u2019re working with a dataset with dimensions\n ('time', 'x', 'y')
, a variable Tair
which is chunked in x
and y
,\n and two multi-dimensional coordinates xc
and yc
:
In [33]: ds = xr.tutorial.open_dataset("rasm")\n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n In [34]: ds["Tair"] = ds["Tair"].chunk({"x": 100, "y": 100})\n KeyError: "No variable named 'Tair'. Variables on the dataset include ['foo', 'x']"\n \n \n In [35]: ds\n@@ -882,15 +882,15 @@\n foo (x) int32 120B 0 1 2 3 4 5 6 7 8 9 ... 21 22 23 24 25 26 27 28 29\n
These multi-dimensional coordinates are only two-dimensional and take up very little\n space on disk or in memory, yet when writing to disk the default zarr behavior is to\n split them into chunks:
\nIn [36]: ds.to_zarr("path/to/directory.zarr", mode="w")\n-Out[36]: <xarray.backends.zarr.ZarrStore at 0xe1df2190>\n+Out[36]: <xarray.backends.zarr.ZarrStore at 0xe1ac24f0>\n \n In [37]: ! ls -R path/to/directory.zarr\n path/to/directory.zarr:\n foo x\tzarr.json\n \n path/to/directory.zarr/foo:\n c zarr.json\n@@ -1069,15 +1069,15 @@\n Ncdata\u00b6
\n Ncdata provides more sophisticated means of transferring data, including entire\n datasets. It uses the file saving and loading functions in both projects to provide a\n more \u201ccorrect\u201d translation between them, but still with very low overhead and not\n using actual disk files.
\n For example:
\n In [48]: ds = xr.tutorial.open_dataset("air_temperature_gradient")\n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n In [49]: cubes = ncdata.iris_xarray.cubes_from_xarray(ds)\n NameError: name 'ncdata' is not defined\n \n \n In [50]: print(cubes)\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -481,15 +481,15 @@\n ....: \"y\": pd.date_range(\"2000-01-01\", periods=5),\n ....: \"z\": (\"x\", list(\"abcd\")),\n ....: },\n ....: )\n ....:\n \n In [13]: ds.to_zarr(\"path/to/directory.zarr\")\n-Out[13]: \n+Out[13]: \n (The suffix .zarr is optional\u2013just a reminder that a zarr store lives there.)\n If the directory does not exist, it will be created. If a zarr store is already\n present at that path, an error will be raised, preventing it from being\n overwritten. To override this behavior and overwrite an existing store, add\n mode='w' when invoking to_zarr().\n DataArrays can also be saved to disk using the DataArray.to_zarr() method, and\n loaded from disk using the open_dataarray() function with engine='zarr'.\n@@ -505,19 +505,19 @@\n To read back a zarr dataset that has been created this way, we use the\n open_zarr() method:\n In [14]: ds_zarr = xr.open_zarr(\"path/to/directory.zarr\")\n \n In [15]: ds_zarr\n Out[15]:\n Size: 232B\n-Dimensions: (x: 4, y: 5)\n+Dimensions: (y: 5, x: 4)\n Coordinates:\n- z (x) object 16B dask.array\n * y (y) datetime64[ns] 40B 2000-01-01 2000-01-02 ... 
2000-01-05\n * x (x) int32 16B 10 20 30 40\n+ z (x) object 16B dask.array\n Data variables:\n foo (x, y) float64 160B dask.array\n *\b**\b**\b**\b* C\bCl\blo\bou\bud\bd S\bSt\bto\bor\bra\bag\bge\be B\bBu\buc\bck\bke\bet\bts\bs_\b?\b\u00b6 *\b**\b**\b**\b*\n It is possible to read and write xarray datasets directly from / to cloud\n storage buckets using zarr. This example uses the _\bg_\bc_\bs_\bf_\bs package to provide an\n interface to _\bG_\bo_\bo_\bg_\bl_\be_\b _\bC_\bl_\bo_\bu_\bd_\b _\bS_\bt_\bo_\br_\ba_\bg_\be.\n General _\bf_\bs_\bs_\bp_\be_\bc URLs, those that begin with s3:// or gcs:// for example, are\n@@ -562,35 +562,35 @@\n \n In [18]: ds = xr.Dataset({\"foo\": (\"x\", dummies)}, coords={\"x\": np.arange(30)})\n \n In [19]: path = \"path/to/directory.zarr\"\n \n # Now we write the metadata without computing any array values\n In [20]: ds.to_zarr(path, compute=False)\n-Out[20]: Delayed('_finalize_store-869d28c3-3cb4-41c1-be67-7ddf964a11d9')\n+Out[20]: Delayed('_finalize_store-5a3251a1-b9f6-41d9-a554-1b45e9c096d5')\n Now, a Zarr store with the correct variable shapes and attributes exists that\n can be filled out by subsequent calls to to_zarr. 
Setting region=\"auto\" will\n open the existing store and determine the correct alignment of the new data\n with the existing dimensions, or as an explicit mapping from dimension names to\n Python slice objects indicating where the data should be written (in index\n space, not label space), e.g.,\n # For convenience, we'll slice a single dataset, but in the real use-case\n # we would create them separately possibly even from separate processes.\n In [21]: ds = xr.Dataset({\"foo\": (\"x\", np.arange(30))}, coords={\"x\": np.arange\n (30)})\n \n # Any of the following region specifications are valid\n In [22]: ds.isel(x=slice(0, 10)).to_zarr(path, region=\"auto\")\n-Out[22]: \n+Out[22]: \n \n In [23]: ds.isel(x=slice(10, 20)).to_zarr(path, region={\"x\": \"auto\"})\n-Out[23]: \n+Out[23]: \n \n In [24]: ds.isel(x=slice(20, 30)).to_zarr(path, region={\"x\": slice(20, 30)})\n-Out[24]: \n+Out[24]: \n Concurrent writes with region are safe as long as they modify distinct chunks\n in the underlying Zarr arrays (or use an appropriate lock).\n As a safety check to make it harder to inadvertently override existing values,\n if you set region then a\bal\bll\bl variables included in a Dataset must have dimensions\n included in region. 
Other variables (typically coordinates) need to be\n explicitly dropped and/or written in a separate calls to to_zarr with mode='a'.\n *\b**\b**\b**\b* Z\bZa\bar\brr\br C\bCo\bom\bmp\bpr\bre\bes\bss\bso\bor\brs\bs a\ban\bnd\bd F\bFi\bil\blt\bte\ber\brs\bs_\b?\b\u00b6 *\b**\b**\b**\b*\n@@ -636,28 +636,28 @@\n ....: \"y\": [1, 2, 3, 4, 5],\n ....: \"t\": pd.date_range(\"2001-01-01\", periods=2),\n ....: },\n ....: )\n ....:\n \n In [30]: ds1.to_zarr(\"path/to/directory.zarr\")\n-Out[30]: \n+Out[30]: \n \n In [31]: ds2 = xr.Dataset(\n ....: {\"foo\": ((\"x\", \"y\", \"t\"), np.random.rand(4, 5, 2))},\n ....: coords={\n ....: \"x\": [10, 20, 30, 40],\n ....: \"y\": [1, 2, 3, 4, 5],\n ....: \"t\": pd.date_range(\"2001-01-03\", periods=2),\n ....: },\n ....: )\n ....:\n \n In [32]: ds2.to_zarr(\"path/to/directory.zarr\", append_dim=\"t\")\n-Out[32]: \n+Out[32]: \n *\b**\b**\b**\b* S\bSp\bpe\bec\bci\bif\bfy\byi\bin\bng\bg c\bch\bhu\bun\bnk\bks\bs i\bin\bn a\ba z\bza\bar\brr\br s\bst\bto\bor\bre\be_\b?\b\u00b6 *\b**\b**\b**\b*\n Chunk sizes may be specified in one of three ways when writing to a zarr store:\n 1. Manual chunk sizing through the use of the encoding argument in\n Dataset.to_zarr():\n 2. Automatic chunking based on chunks in dask arrays\n 3. Default chunk behavior determined by the zarr library\n The resulting chunks will be determined based on the order of the above list;\n@@ -676,15 +676,15 @@\n positional ordering of the dimensions in each array. 
Watch out for arrays with\n differently-ordered dimensions within a single Dataset.\n For example, let\u2019s say we\u2019re working with a dataset with dimensions ('time',\n 'x', 'y'), a variable Tair which is chunked in x and y, and two multi-\n dimensional coordinates xc and yc:\n In [33]: ds = xr.tutorial.open_dataset(\"rasm\")\n PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not\n-create data cache folder '/nonexistent/first-build/.cache/\n+create data cache folder '/nonexistent/second-build/.cache/\n xarray_tutorial_data'. Will not be able to download data files.\n \n \n In [34]: ds[\"Tair\"] = ds[\"Tair\"].chunk({\"x\": 100, \"y\": 100})\n KeyError: \"No variable named 'Tair'. Variables on the dataset include ['foo',\n 'x']\"\n \n@@ -697,15 +697,15 @@\n * x (x) int32 120B 0 1 2 3 4 5 6 7 8 9 ... 21 22 23 24 25 26 27 28 29\n Data variables:\n foo (x) int32 120B 0 1 2 3 4 5 6 7 8 9 ... 21 22 23 24 25 26 27 28 29\n These multi-dimensional coordinates are only two-dimensional and take up very\n little space on disk or in memory, yet when writing to disk the default zarr\n behavior is to split them into chunks:\n In [36]: ds.to_zarr(\"path/to/directory.zarr\", mode=\"w\")\n-Out[36]: \n+Out[36]: \n \n In [37]: ! ls -R path/to/directory.zarr\n path/to/directory.zarr:\n foo x\tzarr.json\n \n path/to/directory.zarr/foo:\n c zarr.json\n@@ -858,15 +858,15 @@\n _\bN_\bc_\bd_\ba_\bt_\ba provides more sophisticated means of transferring data, including entire\n datasets. 
It uses the file saving and loading functions in both projects to\n provide a more \u201ccorrect\u201d translation between them, but still with very low\n overhead and not using actual disk files.\n For example:\n In [48]: ds = xr.tutorial.open_dataset(\"air_temperature_gradient\")\n PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not\n-create data cache folder '/nonexistent/first-build/.cache/\n+create data cache folder '/nonexistent/second-build/.cache/\n xarray_tutorial_data'. Will not be able to download data files.\n \n \n In [49]: cubes = ncdata.iris_xarray.cubes_from_xarray(ds)\n NameError: name 'ncdata' is not defined\n \n \n"}]}, {"source1": "./usr/share/doc/python-xarray-doc/html/user-guide/plotting.html", "source2": "./usr/share/doc/python-xarray-doc/html/user-guide/plotting.html", "unified_diff": "@@ -100,15 +100,15 @@\n In [3]: import matplotlib.pyplot as plt\n \n In [4]: import xarray as xr\n
\n \n For these examples we\u2019ll use the North American air temperature dataset.
\n In [5]: airtemps = xr.tutorial.open_dataset("air_temperature")\n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n In [6]: airtemps\n NameError: name 'airtemps' is not defined\n \n \n # Convert to celsius\n@@ -445,15 +445,15 @@\n \n # Apply a nonlinear transformation to one of the coords\n In [50]: b.coords["lat"] = np.log(b.coords["lat"])\n KeyError: 'lat'\n \n \n In [51]: b.plot()\n-Out[51]: [<matplotlib.lines.Line2D at 0xe03a1870>]\n+Out[51]: [<matplotlib.lines.Line2D at 0xdfffc660>]\n
\n \n
\n \n \n \n Other types of plot\u00b6
\n@@ -857,117 +857,117 @@\n * y (y) float64 88B 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0\n * z (z) int32 16B 0 1 2 3\n * w (w) <U5 80B 'one' 'two' 'three' 'five'\n Attributes:\n units: Aunits\n \n In [99]: ds.A.plot.scatter(x="y")\n-Out[99]: <matplotlib.collections.PathCollection at 0xe246cf48>\n+Out[99]: <matplotlib.collections.PathCollection at 0xe2177f48>\n
The same plot can be displayed using the dataset:
\nIn [100]: ds.plot.scatter(x="y", y="A")\n-Out[100]: <matplotlib.collections.PathCollection at 0xe01913a0>\n+Out[100]: <matplotlib.collections.PathCollection at 0xdffc3240>\n
Now suppose we want to scatter the A
DataArray against the B
DataArray
In [101]: ds.plot.scatter(x="A", y="B")\n-Out[101]: <matplotlib.collections.PathCollection at 0xe01ac2f0>\n+Out[101]: <matplotlib.collections.PathCollection at 0xdfde9190>\n
The hue
kwarg lets you vary the color by variable value
In [102]: ds.plot.scatter(x="A", y="B", hue="w")\n-Out[102]: <matplotlib.collections.PathCollection at 0xe01a7240>\n+Out[102]: <matplotlib.collections.PathCollection at 0xdfe17710>\n
You can force a legend instead of a colorbar by setting add_legend=True, add_colorbar=False
.
In [103]: ds.plot.scatter(x="A", y="B", hue="w", add_legend=True, add_colorbar=False)\n-Out[103]: <matplotlib.collections.PathCollection at 0xe01ac7c0>\n+Out[103]: <matplotlib.collections.PathCollection at 0xe132b9d0>\n
In [104]: ds.plot.scatter(x="A", y="B", hue="w", add_legend=False, add_colorbar=True)\n-Out[104]: <matplotlib.collections.PathCollection at 0xe015fdf0>\n+Out[104]: <matplotlib.collections.PathCollection at 0xdff8a920>\n
The markersize
kwarg lets you vary the point\u2019s size by variable value.\n You can additionally pass size_norm
to control how the variable\u2019s values are mapped to point sizes.
In [105]: ds.plot.scatter(x="A", y="B", hue="y", markersize="z")\n-Out[105]: <matplotlib.collections.PathCollection at 0xe012a030>\n+Out[105]: <matplotlib.collections.PathCollection at 0xdfd86190>\n
The z
kwarg lets you plot the data along the z-axis as well.
In [106]: ds.plot.scatter(x="A", y="B", z="z", hue="y", markersize="x")\n-Out[106]: <mpl_toolkits.mplot3d.art3d.Path3DCollection at 0xe014cea0>\n+Out[106]: <mpl_toolkits.mplot3d.art3d.Path3DCollection at 0xe200fbe0>\n
Faceting is also possible
\nIn [107]: ds.plot.scatter(x="A", y="B", hue="y", markersize="x", row="x", col="w")\n-Out[107]: <xarray.plot.facetgrid.FacetGrid at 0xe83d1030>\n+Out[107]: <xarray.plot.facetgrid.FacetGrid at 0xe80d03c8>\n
And adding the z-axis
\nIn [108]: ds.plot.scatter(x="A", y="B", z="z", hue="y", markersize="x", row="x", col="w")\n-Out[108]: <xarray.plot.facetgrid.FacetGrid at 0xdfaecea0>\n+Out[108]: <xarray.plot.facetgrid.FacetGrid at 0xdf74aea0>\n
For more advanced scatter plots, we recommend converting the relevant data variables\n to a pandas DataFrame and using the extensive plotting capabilities of seaborn
.
Visualizing vector fields is supported with quiver plots:
\nIn [109]: ds.isel(w=1, z=1).plot.quiver(x="x", y="y", u="A", v="B")\n-Out[109]: <matplotlib.quiver.Quiver at 0xe1ea23c8>\n+Out[109]: <matplotlib.quiver.Quiver at 0xe803d3c8>\n
where u
and v
denote the x and y direction components of the arrow vectors. Again, faceting is also possible:
In [110]: ds.plot.quiver(x="x", y="y", u="A", v="B", col="w", row="z", scale=4)\n-Out[110]: <xarray.plot.facetgrid.FacetGrid at 0xe49572d0>\n+Out[110]: <xarray.plot.facetgrid.FacetGrid at 0xe183c2d0>\n
scale
is required for faceted quiver plots.\n The scale determines the number of data units per arrow length unit, i.e. a smaller scale parameter makes the arrow longer.
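As a rough illustration of that inverse relationship, the sketch below restates the "data units per arrow length unit" definition as plain arithmetic. The helper name `arrow_length` is hypothetical and is not part of xarray or matplotlib; it only shows why a smaller scale yields a longer arrow.

```python
def arrow_length(magnitude: float, scale: float) -> float:
    """Arrow length, in arrow length units, for a vector of the given
    magnitude (in data units) when `scale` data units correspond to one
    arrow length unit. Hypothetical helper for illustration only."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return magnitude / scale


# The same vector (magnitude 2.0) drawn with two different scale values.
with_scale_4 = arrow_length(2.0, scale=4)
with_scale_8 = arrow_length(2.0, scale=8)

# Halving the scale doubles the arrow length.
print(with_scale_4, with_scale_8)
```

Under this reading, passing one `scale` to a faceted quiver plot keeps arrow lengths comparable across panels, since every panel uses the same divisor.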
Visualizing vector fields is also supported with streamline plots:
\nIn [111]: ds.isel(w=1, z=1).plot.streamplot(x="x", y="y", u="A", v="B")\n-Out[111]: <matplotlib.collections.LineCollection at 0xe00bd710>\n+Out[111]: <matplotlib.collections.LineCollection at 0xdfcdc710>\n
where u
and v
denote the x and y direction components of the vectors tangent to the streamlines.\n Again, faceting is also possible:
In [112]: ds.plot.streamplot(x="x", y="y", u="A", v="B", col="w", row="z")\n-Out[112]: <xarray.plot.facetgrid.FacetGrid at 0xe1c9eca8>\n+Out[112]: <xarray.plot.facetgrid.FacetGrid at 0xe02ed570>\n
To follow this section you\u2019ll need to have Cartopy installed and working.
\nThis script will plot the air temperature on a map.
\nIn [113]: import cartopy.crs as ccrs\n \n In [114]: air = xr.tutorial.open_dataset("air_temperature").air\n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n In [115]: p = air.isel(time=0).plot(\n .....: subplot_kws=dict(projection=ccrs.Orthographic(-80, 35), facecolor="gray"),\n .....: transform=ccrs.PlateCarree(),\n .....: )\n .....: \n@@ -1024,24 +1024,24 @@\n In [121]: import xarray.plot as xplt\n \n In [122]: da = xr.DataArray(range(5))\n \n In [123]: fig, axs = plt.subplots(ncols=2, nrows=2)\n \n In [124]: da.plot(ax=axs[0, 0])\n-Out[124]: [<matplotlib.lines.Line2D at 0xdfefff50>]\n+Out[124]: [<matplotlib.lines.Line2D at 0xe7f6b870>]\n \n In [125]: da.plot.line(ax=axs[0, 1])\n-Out[125]: [<matplotlib.lines.Line2D at 0xdfaf4b30>]\n+Out[125]: [<matplotlib.lines.Line2D at 0xdf1db2f0>]\n \n In [126]: xplt.plot(da, ax=axs[1, 0])\n-Out[126]: [<matplotlib.lines.Line2D at 0xdf57d2f0>]\n+Out[126]: [<matplotlib.lines.Line2D at 0xdf1db3a0>]\n \n In [127]: xplt.line(da, ax=axs[1, 1])\n-Out[127]: [<matplotlib.lines.Line2D at 0xdf57d3a0>]\n+Out[127]: [<matplotlib.lines.Line2D at 0xdf1dbd40>]\n \n In [128]: plt.tight_layout()\n \n In [129]: plt.draw()\n
\n \n
\n@@ -1091,15 +1091,15 @@\n
The plot will produce an image corresponding to the values of the array.\n Hence the top left pixel will be a different color than the others.\n Before reading on, you may want to look at the coordinates and\n think carefully about what the limits, labels, and orientation for\n each of the axes should be.
\nIn [134]: a.plot()\n-Out[134]: <matplotlib.collections.QuadMesh at 0xde193870>\n+Out[134]: <matplotlib.collections.QuadMesh at 0xdddf1870>\n
It may seem strange that\n the values on the y axis are decreasing with -0.5 on the top. This is because\n the pixels are centered over their coordinates, and the\n@@ -1122,57 +1122,57 @@\n .....: np.arange(20).reshape(4, 5),\n .....: dims=["y", "x"],\n .....: coords={"lat": (("y", "x"), lat), "lon": (("y", "x"), lon)},\n .....: )\n .....: \n \n In [139]: da.plot.pcolormesh(x="lon", y="lat")\n-Out[139]: <matplotlib.collections.QuadMesh at 0xdf5afdf0>\n+Out[139]: <matplotlib.collections.QuadMesh at 0xdfc0ba80>\n
Note that in this case, xarray still follows the pixel centered convention.\n This might be undesirable in some cases, for example when your data is defined\n on a polar projection (GH781). This is why the default is to not follow\n this convention when plotting on a map:
\nIn [140]: import cartopy.crs as ccrs\n \n In [141]: ax = plt.subplot(projection=ccrs.PlateCarree())\n \n In [142]: da.plot.pcolormesh(x="lon", y="lat", ax=ax)\n-Out[142]: <cartopy.mpl.geocollection.GeoQuadMesh at 0xde1fe920>\n+Out[142]: <cartopy.mpl.geocollection.GeoQuadMesh at 0xdde0e240>\n \n In [143]: ax.scatter(lon, lat, transform=ccrs.PlateCarree())\n-Out[143]: <matplotlib.collections.PathCollection at 0xde121030>\n+Out[143]: <matplotlib.collections.PathCollection at 0xdddeb030>\n \n In [144]: ax.coastlines()\n-Out[144]: <cartopy.mpl.feature_artist.FeatureArtist at 0xe83c5af8>\n+Out[144]: <cartopy.mpl.feature_artist.FeatureArtist at 0xe80c6bb0>\n \n In [145]: ax.gridlines(draw_labels=True)\n-Out[145]: <cartopy.mpl.gridliner.Gridliner at 0xe032b538>\n+Out[145]: <cartopy.mpl.gridliner.Gridliner at 0xdffa31a0>\n
You can however decide to infer the cell boundaries and use the\n infer_intervals
keyword:
In [146]: ax = plt.subplot(projection=ccrs.PlateCarree())\n \n In [147]: da.plot.pcolormesh(x="lon", y="lat", ax=ax, infer_intervals=True)\n-Out[147]: <cartopy.mpl.geocollection.GeoQuadMesh at 0xde0a3c90>\n+Out[147]: <cartopy.mpl.geocollection.GeoQuadMesh at 0xddd04c90>\n \n In [148]: ax.scatter(lon, lat, transform=ccrs.PlateCarree())\n-Out[148]: <matplotlib.collections.PathCollection at 0xde0af3a0>\n+Out[148]: <matplotlib.collections.PathCollection at 0xddd103a0>\n \n In [149]: ax.coastlines()\n-Out[149]: <cartopy.mpl.feature_artist.FeatureArtist at 0xde0af450>\n+Out[149]: <cartopy.mpl.feature_artist.FeatureArtist at 0xddd10450>\n \n In [150]: ax.gridlines(draw_labels=True)\n-Out[150]: <cartopy.mpl.gridliner.Gridliner at 0xde0af500>\n+Out[150]: <cartopy.mpl.gridliner.Gridliner at 0xddd10500>\n
Note
\nThe data model of xarray does not support datasets with cell boundaries\n@@ -1180,26 +1180,26 @@\n outside the xarray framework.
\nOne can also make line plots with multidimensional coordinates. In this case, hue
must be a dimension name, not a coordinate name.
In [151]: f, ax = plt.subplots(2, 1)\n \n In [152]: da.plot.line(x="lon", hue="y", ax=ax[0])\n Out[152]: \n-[<matplotlib.lines.Line2D at 0xde0a97c0>,\n- <matplotlib.lines.Line2D at 0xddf3f030>,\n- <matplotlib.lines.Line2D at 0xddf3f0e0>,\n- <matplotlib.lines.Line2D at 0xddf3f190>]\n+[<matplotlib.lines.Line2D at 0xddd097c0>,\n+ <matplotlib.lines.Line2D at 0xddba0030>,\n+ <matplotlib.lines.Line2D at 0xddba00e0>,\n+ <matplotlib.lines.Line2D at 0xddba0190>]\n \n In [153]: da.plot.line(x="lon", hue="x", ax=ax[1])\n Out[153]: \n-[<matplotlib.lines.Line2D at 0xddf48500>,\n- <matplotlib.lines.Line2D at 0xddf485b0>,\n- <matplotlib.lines.Line2D at 0xddf48660>,\n- <matplotlib.lines.Line2D at 0xddf48710>,\n- <matplotlib.lines.Line2D at 0xddf487c0>]\n+[<matplotlib.lines.Line2D at 0xddba8500>,\n+ <matplotlib.lines.Line2D at 0xddba85b0>,\n+ <matplotlib.lines.Line2D at 0xddba8660>,\n+ <matplotlib.lines.Line2D at 0xddba8710>,\n+ <matplotlib.lines.Line2D at 0xddba87c0>]\n
Whilst coarsen
is normally used for reducing your data\u2019s resolution by applying a reduction function\n (see the page on computation),\n it can also be used to reorganise your data without applying a computation via construct()
.
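A minimal sketch of this reorganising use, on synthetic data rather than the tutorial dataset. It assumes a hypothetical 24-step series and splits its `time` dimension into `year` and `month` dimensions; the array contents and the new dimension names are illustrative choices.

```python
import numpy as np
import xarray as xr

# Hypothetical series with 24 steps along "time" (e.g. two years of months).
da = xr.DataArray(np.arange(24), dims="time")

# Split "time" into windows of 12 without applying any reduction:
# each window becomes one entry along "year", and the positions inside
# a window run along "month".
reshaped = da.coarsen(time=12).construct(time=("year", "month"))

print(reshaped.dims, reshaped.shape)
```

No values are aggregated here; the 24 original values are only rearranged into a 2 x 12 layout.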
Taking our example tutorial air temperature dataset over the Northern US
\nIn [56]: air = xr.tutorial.open_dataset("air_temperature")["air"]\n-PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/first-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n+PermissionError: [Errno 13] Permission denied: '/nonexistent' | Pooch could not create data cache folder '/nonexistent/second-build/.cache/xarray_tutorial_data'. Will not be able to download data files.\n \n \n In [57]: air.isel(time=0).plot(x="lon", y="lat")\n NameError: name 'air' is not defined\n
To see an example of what each of these strategies might produce, you can call one followed by the .example()
method,\n which is a general hypothesis method valid for all strategies.
In [2]: import xarray.testing.strategies as xrst\n \n In [3]: xrst.variables().example()\n Out[3]: \n-<xarray.Variable (\u017b\u017f5\u012e\u0143: 5)> Size: 20B\n-array([-2147435733, 48956106, -2147463450, -2147483495, 952662193])\n-Attributes:\n- \u0156\u0147\u00f9: False\n- \u0157T\u010aU\u00e6: None\n- W\u00ed2\u00ba\u00b9: [['(']]\n- \u00ee\u00ff\u00cd\u017f\u00d3: ['NaT' 'NaT']\n- : \u00c3\u0152\u00b2\u0122\u0166\n- pk\u0143: \u017d\u0131\n- \u0170NYS: None\n- \u00ba\u00d55J: G\u00ed\n- b\u017b: False\n- \u017c\u00f9\u016f\u0104m: True\n+<xarray.Variable (\u0171: 3, \u0176: 4)> Size: 96B\n+array([[ inf +infj, 1.000e+00 +infj, 6.104e-05+2.220e-16j,\n+ -0.000e+00-1.100e+00j],\n+ [ 0.000e+00 +infj, 0.000e+00-5.000e-01j, -3.381e+16 +infj,\n+ -1.192e-07 +infj],\n+ [ 1.192e-07-1.175e-38j, -5.000e-01+0.000e+00j, 0.000e+00-1.100e+00j,\n+ 1.192e-07-1.175e-38j]], shape=(3, 4), dtype=complex64)\n \n In [4]: xrst.variables().example()\n Out[4]: \n-<xarray.Variable (0: 1)> Size: 8B\n-array([0.+0.j], dtype=complex64)\n+<xarray.Variable (\u00e3\u017c: 1, \u0178\u00c6: 1)> Size: 4B\n+array([[4052997171]], dtype=uint32)\n+Attributes:\n+ \u017e: \u0161\u00da\u017b\u0129\u00df\n+ \u011d\u0126\u00c7b\u00cc: None\n+ : False\n+ \u017d\u00b3\u015a\u00ca\u017d: None\n+ \u00e2\u00fe\u017bp\u00c1: [['\u00e38\\x08\u00e3\\x1bp\\x1c\\U0010b29f' '\u00bd\\x7f\u00c4\\x03c\u00bd']\\n ['.\u00a3\\x0e\u00e6\\U000...\n+ \u0179\u00f4\u017d\u00ec\u0178: \n+ \u00fb\u00f9\u00d4\u0119\u015f: None\n+ \u017f\u012c\u017b\u0137\u00cd: False\n+ \u01678\u011b\u017e\u017b: None\n+ \u010f\u0137: 5\u012f\u00fe\u00d5\u017d\n+ Z: None\n \n In [5]: xrst.variables().example()\n Out[5]: \n-<xarray.Variable (0: 1)> Size: 8B\n-array([0.+0.j], dtype=complex64)\n+<xarray.Variable (V\u00be\u0115\u00c0: 2)> Size: 2B\n+array([36, 36], dtype=int8)\n+Attributes:\n+ Q\u00ee\u0123\u00e1\u00f2: True\n+ : None\n+ \u0145\u0122\u0164\u017b: None\n+ \u00cf: None\n
You can see that calling .example()
multiple times will generate different examples, giving you an idea of the wide\n range of data that the xarray strategies can generate.
In your tests, however, you should not use .example()
- instead you should parameterize your tests with the\n hypothesis.given()
decorator:
In [6]: from hypothesis import given\n@@ -132,94 +143,62 @@\n Xarray\u2019s strategies can accept other strategies as arguments, allowing you to customise the contents of the generated\n examples.
\n # generate a Variable containing an array with a complex number dtype, but all other details still arbitrary\n In [8]: from hypothesis.extra.numpy import complex_number_dtypes\n \n In [9]: xrst.variables(dtype=complex_number_dtypes()).example()\n Out[9]: \n-<xarray.Variable (\u00eaH\u017f\u017f: 6, \u00b3\u0169\u012fC: 5, \u00fa\u00f2\u0114: 1)> Size: 480B\n-array([[[ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j]],\n-\n- [[ 2.220e-016 -infj],\n- [-5.960e-008-2.056e-262j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j]],\n-\n- [[ 1.113e-308-4.941e-324j],\n- [ nan +nanj],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j]],\n-\n- [[ 2.225e-309-1.100e+000j],\n- [ 1.113e-308-4.941e-324j],\n- [-5.000e-001-2.225e-313j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j]],\n-\n- [[ 1.113e-308-4.941e-324j],\n- [ 3.949e+014+2.220e-016j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j]],\n-\n- [[ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j],\n- [ 1.113e-308-4.941e-324j],\n- [ 5.000e-001-1.000e+000j]]], shape=(6, 5, 1))\n-Attributes:\n- \u00cc7Z\u013b\u015f: {'': array([9223372036854775806], dtype='timedelta64[us]'), '\u00e4q...\n- : {'\u017b': True, '\u0126j\u0173': array(['NaT'], dtype='datetime64[m]'), 'dO\u0157\u0168...\n- \u0147\u013az\u0131\u017f: {'\u010e\u017d\u0146': None, '\u00b3\u017d9': array([ 'NaT', -149645564191...\n+<xarray.Variable (\u00d5\u00be\u00be\u017f: 3, E\u017ft: 3)> Size: 144B\n+array([[-3.515e+016 -infj, -1.175e-038+1.401e-45j, -inf+1.900e+00j],\n+ [-1.500e+000+1.500e+00j, 1.144e+243 -infj, -3.333e-001+0.000e+00j],\n+ [ inf+1.900e+00j, inf-9.596e+15j, -5.000e-001 +nanj]], dtype='>c16')\n
\n \n This also works with custom strategies, or strategies defined in other packages.\n For example you could imagine creating a chunks
strategy to specify particular chunking patterns for a dask-backed array.
\n \n \n Fixing Arguments\u00b6
\n If you want to fix one aspect of the data structure, whilst allowing variation in the generated examples\n over all other aspects, then use hypothesis.strategies.just()
.
\n In [10]: import hypothesis.strategies as st\n \n # Generates only variable objects with dimensions ["x", "y"]\n In [11]: xrst.variables(dims=st.just(["x", "y"])).example()\n Out[11]: \n-<xarray.Variable (x: 1, y: 1)> Size: 8B\n-array([[0.+0.j]], dtype=complex64)\n+<xarray.Variable (x: 2, y: 2)> Size: 64B\n+array([[-1.113e-308 -1.j, nan+infj],\n+ [ -inf+infj, nan+infj]])\n
\n \n (This is technically another example of chaining strategies - hypothesis.strategies.just()
is simply a\n special strategy that just contains a single example.)
\n To fix the length of dimensions you can instead pass dims
as a mapping of dimension names to lengths\n (i.e. following xarray objects\u2019 .sizes
property), e.g.
\n # Generates only variables with dimensions ["x", "y"], of lengths 2 & 3 respectively\n In [12]: xrst.variables(dims=st.just({"x": 2, "y": 3})).example()\n Out[12]: \n-<xarray.Variable (x: 2, y: 3)> Size: 12B\n-array([[-1.500e+00, -1.500e+00, 2.000e+00],\n- [-6.104e-05, inf, 0.000e+00]], dtype=float16)\n+<xarray.Variable (x: 2, y: 3)> Size: 48B\n+array([[-9223372036854742917, -7320804107538452131, 3778513437244823336],\n+ [-9223372036854750587, -9223372036854764399, -9223372036854765406]], dtype=int64)\n Attributes:\n- \u0110\u017f: {}\n- \u00dd\u017feG: {}\n- \u017b6: {'\u017d\u017c': array([b'\\x05'], dtype='|S6'), '\u0146\u0162\u017b\u015c': False, '\u017e\u013e\u017e0\u0141': '...\n+ \u017e\u00e8: ['' '\\x15']\n+ \u0107e: None\n+ \u017d\u00f9: None\n+ \u0118\u00e9\u0133\u00caS: \u0130\u00c6\u0149\n
\n \n You can also use this to specify that you want examples which are missing some part of the data structure, for instance
\n # Generates a Variable with no attributes\n In [13]: xrst.variables(attrs=st.just({})).example()\n Out[13]: \n-<xarray.Variable (\u00eb\u00e6\u0119\u0129\u017e: 2)> Size: 32B\n-array([ -inf-6.104e-005j, 5.e-324-1.798e+308j])\n+<xarray.Variable (\u00b3: 3, F\u00f8y\u012e: 3)> Size: 72B\n+array([[ 57204, 1339669549717624835, 159],\n+ [ 3389693394, 23216, 3031],\n+ [ 1275883880, 53237, 2035765567]], dtype=uint64)\n
\n \n Through a combination of chaining strategies and fixing arguments, you can specify quite complicated requirements on the\n objects your chained strategy will generate.
\n In [14]: fixed_x_variable_y_maybe_z = st.fixed_dictionaries(\n ....: {"x": st.just(2), "y": st.integers(3, 4)}, optional={"z": st.just(2)}\n ....: )\n@@ -228,40 +207,48 @@\n In [15]: fixed_x_variable_y_maybe_z.example()\n Out[15]: {'x': 2, 'y': 3, 'z': 2}\n \n In [16]: special_variables = xrst.variables(dims=fixed_x_variable_y_maybe_z)\n \n In [17]: special_variables.example()\n Out[17]: \n-<xarray.Variable (x: 2, y: 3, z: 2)> Size: 48B\n-array([[[-4.413e+16, 5.748e+16],\n- [-1.000e-05, -3.333e-01],\n- [ inf, -0.000e+00]],\n-\n- [[-0.000e+00, inf],\n- [ nan, -5.960e-08],\n- [ nan, nan]]], shape=(2, 3, 2), dtype=float32)\n+<xarray.Variable (x: 2, y: 4, z: 2)> Size: 256B\n+array([[[ 1.401e-045-4.941e-324j, 4.771e-206 +nanj],\n+ [ 2.225e-313 -infj, 2.225e-308-1.421e-231j],\n+ [-1.532e+245+1.798e+308j, 3.403e+038+1.798e+308j],\n+ [ 5.960e-008-2.225e-308j, 3.139e+016+2.225e-309j]],\n+\n+ [[ -inf+6.102e+016j, 9.007e+015 +nanj],\n+ [-1.500e+000 +nanj, -1.792e+016 -infj],\n+ [ 1.000e-005-1.401e-045j, 1.000e+000 -infj],\n+ [ nan +infj, 6.104e-005-1.206e+016j]]], shape=(2, 4, 2))\n Attributes:\n- : None\n- \u017b: [b'' b'']\n- \u00e5\u00fc: t\u0121\u015fn\n- \u0136bK\u00ed\u0115: None\n- \u0118\u00d8L\u017bM: True\n- \u00e2: \n- M: \u017f\n+ \u00ff\u017b\u017b\u0140\u017a: {'\u017cQKq\u017d': None, '': '\u00ccS\u00f9\u0108q', '\u0100\u017fV\u017d': None, 'b': None, 'q\u00fb\u00cd': No...\n \n In [18]: special_variables.example()\n Out[18]: \n-<xarray.Variable (x: 2, y: 3)> Size: 24B\n-array([[-2147483516, -2147420238, -2147483516],\n- [-2147483396, -2147471625, -2147483429]])\n-Attributes:\n- : {}\n- Th\u00c4\u00f8d: {'\u0171': False, '': array([[b'\\x08a\\xe8', b'\\x0b\\xb0']], dtype='|S...\n- \u017e\u013a: {}\n+<xarray.Variable (x: 2, y: 4)> Size: 128B\n+array([[-1.000e+07 +infj, -1.000e+07 +infj, inf+1.113e-308j,\n+ 1.175e-38 +infj],\n+ [ 5.960e-08+3.333e-001j, 5.960e-08+4.941e-324j, -inf-2.225e-311j,\n+ 9.007e+15+6.642e+016j]])\n+Attributes: (12/21)\n+ a\u014c\u017c: 
False\n+ : None\n+ a\u0141q\u00fa\u00aa: True\n+ \u015a\u00d4: True\n+ \u0119\u017c\u0123: ['']\n+ \u0179: [['NaT']]\n+ ... ...\n+ \u017b\u0109\u017f\u00f6K: [[b'']]\n+ \u00c4: None\n+ \u0105\u010f\u00b2\u015f\u00d4: [[ 0.+3.403e+38j]\\n [-inf-0.000e+00j]]\n+ \u00ea\u0125: [['\ua231\\x87\u00feM\u00fb\\x9e']\\n ['\\U00060fc0\\U0007a060\\\\\u00f2']]\n+ \u00e6: \u00ec\u017c0\u0170\u00d9\n+ \u00c9\u017e\u013e\u0148\u00e6: [[b'\\xbb\\xda']\\n [b'\\xe3\\xd09\\x0f\\xd9.Fr']]\n
\n \n Here we have used one of hypothesis\u2019 built-in strategies hypothesis.strategies.fixed_dictionaries()
to create a\n strategy which generates mappings of dimension names to lengths (i.e. the size
of the xarray object we want).\n This particular strategy will always generate an x
dimension of length 2, and a y
dimension of\n length either 3 or 4, and will sometimes also generate a z
dimension of length 2.\n By feeding this strategy for dictionaries into the dims
argument of xarray\u2019s variables()
strategy,\n@@ -362,50 +349,52 @@\n ....: array_strategy_fn=xps.arrays,\n ....: dtype=xps.scalar_dtypes(),\n ....: )\n ....: \n \n In [32]: xp_variables.example()\n Out[32]: \n-<xarray.Variable (\u0132EH\u011d: 2, \u0147\u012e\u00c8: 2, \u00c9\u00eb: 4)> Size: 64B\n-array([[[ 0. , 0. , 0. , 0. ],\n- [ 0. , 0. , 0. , 0. ]],\n-\n- [[ 0.333, 0. , 0. , -0. ],\n- [ 0. , 0. , 0. , 0. ]]], shape=(2, 2, 4), dtype=float32)\n+<xarray.Variable (\u00c5\u017b\u0171: 5, \u0165\u0168\u00ddJ\u010e: 6)> Size: 240B\n+array([[-3.403e+38 -infj, nan-0.000e+00j, -3.403e+38 -infj,\n+ -2.220e-16-5.384e+16j, nan -infj, -3.403e+38 -infj],\n+ [-0.000e+00-0.000e+00j, -3.403e+38 -infj, -3.403e+38 -infj,\n+ nan+3.333e-01j, 3.333e-01 +nanj, inf+1.100e+00j],\n+ [-3.403e+38 -infj, nan-0.000e+00j, 0.000e+00+0.000e+00j,\n+ -3.403e+38 -infj, -3.403e+38 -infj, -3.403e+38 -infj],\n+ [-0.000e+00+0.000e+00j, 0.000e+00 +nanj, -3.403e+38 -infj,\n+ 4.812e+16 +infj, -3.403e+38 -infj, -inf-1.401e-45j],\n+ [ -inf+0.000e+00j, inf+1.000e+07j, nan+1.100e+00j,\n+ 0.000e+00 +nanj, 0.000e+00-3.141e+15j, -inf+4.623e+16j]],\n+ shape=(5, 6), dtype=complex64)\n Attributes:\n- \u015f\u00f9\u017f\u00fd: \n- \u0104\u0170g\u00b2: \u00dc\u017c\u015d\u0168\u0145\n- \u016f: [ 12 -108]\n- \u0156\u017d\u0129\u017ds: \u00f3\u00bc\u0107\u00c5\u017e\n- \u00f6H: [['']]\n+ \u017b\u017d\u017e\u00e7\u017d: {'\u012a\u00db\u0125': True, 'y\u00c8er\u00dd': True, '\u0107U\u0120': array([[ 'NaT...\n
Another array API-compliant duck array library would replace the import, e.g. import cupy as cp
instead.
A common task when testing xarray user code is checking that your function works for all valid input dimensions.\n We can chain strategies to achieve this, for which the helper strategy unique_subset_of()
\n is useful.
It works for lists of dimension names
\nIn [33]: dims = ["x", "y", "z"]\n \n In [34]: xrst.unique_subset_of(dims).example()\n-Out[34]: ['y', 'z', 'x']\n+Out[34]: ['z', 'y']\n \n In [35]: xrst.unique_subset_of(dims).example()\n-Out[35]: ['x']\n+Out[35]: ['y', 'x', 'z']\n
as well as for mappings of dimension names to sizes
\nIn [36]: dim_sizes = {"x": 2, "y": 3, "z": 4}\n \n In [37]: xrst.unique_subset_of(dim_sizes).example()\n-Out[37]: {'x': 2, 'y': 3, 'z': 4}\n+Out[37]: {'y': 3}\n \n In [38]: xrst.unique_subset_of(dim_sizes).example()\n Out[38]: {'z': 4, 'y': 3}\n
This is useful because operations like reductions can be performed over any subset of the xarray object\u2019s dimensions.\n For example we can write a pytest test that tests that a reduction gives the expected result when applying that reduction\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -28,37 +28,49 @@\n To see an example of what each of these strategies might produce, you can call\n one followed by the .example() method, which is a general hypothesis method\n valid for all strategies.\n In [2]: import xarray.testing.strategies as xrst\n \n In [3]: xrst.variables().example()\n Out[3]:\n- New Added new methods xray.Dataset.where
method for masking xray objects according\n to some criteria. This works particularly well with multi-dimensional data:In [44]: ds = xray.Dataset(coords={"x": range(100), "y": range(100)})\n \n In [45]: ds["distance"] = np.sqrt(ds.x**2 + ds.y**2)\n \n In [46]: ds.distance.where(ds.distance < 100).plot()\n-Out[46]: <matplotlib.collections.QuadMesh at 0xe227b5b0>\n+Out[46]: <matplotlib.collections.QuadMesh at 0xdd515ea0>\n
\n \n
xray.DataArray.diff
and xray.Dataset.diff
\n for finite difference calculations along a given axis.