{"diffoscope-json-version": 1, "source1": "/srv/reproducible-results/rbuild-debian/r-b-build.2DKiz9TF/b1/python-xarray_2025.03.1-8_arm64.changes", "source2": "/srv/reproducible-results/rbuild-debian/r-b-build.2DKiz9TF/b2/python-xarray_2025.03.1-8_arm64.changes", "unified_diff": null, "details": [{"source1": "Files", "source2": "Files", "unified_diff": "@@ -1,3 +1,3 @@\n \n- 6dd6efcdef4f029d7295aee69485d405 5274584 doc optional python-xarray-doc_2025.03.1-8_all.deb\n+ d7d33546992ecc3b5bf58616ffc12a83 5275120 doc optional python-xarray-doc_2025.03.1-8_all.deb\n 42f0e777ab375981fc3c7e3f07caf930 820184 python optional python3-xarray_2025.03.1-8_all.deb\n"}, {"source1": "python-xarray-doc_2025.03.1-8_all.deb", "source2": "python-xarray-doc_2025.03.1-8_all.deb", "unified_diff": null, "details": [{"source1": "file list", "source2": "file list", "unified_diff": "@@ -1,3 +1,3 @@\n -rw-r--r-- 0 0 0 4 2025-05-05 09:04:03.000000 debian-binary\n -rw-r--r-- 0 0 0 7560 2025-05-05 09:04:03.000000 control.tar.xz\n--rw-r--r-- 0 0 0 5266832 2025-05-05 09:04:03.000000 data.tar.xz\n+-rw-r--r-- 0 0 0 5267368 2025-05-05 09:04:03.000000 data.tar.xz\n"}, {"source1": "control.tar.xz", "source2": "control.tar.xz", "unified_diff": null, "details": [{"source1": "control.tar", "source2": "control.tar", "unified_diff": null, "details": [{"source1": "./control", "source2": "./control", "unified_diff": "@@ -1,13 +1,13 @@\n Package: python-xarray-doc\n Source: python-xarray\n Version: 2025.03.1-8\n Architecture: all\n Maintainer: Debian Science Maintainers Let\u2019s create a simple plot of 2-m air temperature in degrees Celsius: Write equations to calculate the vertical coordinate. These will be only evaluated when data is requested. 
Information about the ROMS vertical coordinate can be found [here](https://www.myroms.org/wiki/Vertical_S-coordinate) In short, for The function we will apply is Plot the first timestep: We first have to come up with the weights, - calculate the month length for each monthly data record - calculate weights using Finally, we just need to multiply our weights by the In this example, the logical coordinates are Control the map projection parameters on multiple axes This example illustrates how to plot multiple maps and control their extent and aspect ratio. For more details see this discussion on github. Visualizing your datasets is quick and convenient: Note the automatic labeling with names and units. Our effort in adding metadata attributes has paid off! Many aspects of these figures are customizable: see Plotting. Note This method replicates the behavior of [3]:\n
\n-Error in callback <function _draw_all_if_interactive at 0xffff5fa99b20> (for post_execute), with arguments args (),kwargs {}:\n+Error in callback <function _draw_all_if_interactive at 0xffff45de9b20> (for post_execute), with arguments args (),kwargs {}:\n
\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -93,15 +93,15 @@\n File /usr/lib/python3/dist-packages/urllib3/connection.py:205, in\n HTTPConnection._new_conn(self)\n 204 except socket.gaierror as e:\n --> 205 raise NameResolutionError(self.host, self, e) from e\n 206 except SocketTimeout as e:\n \n NameResolutionError:
Add a lazily calculated vertical coordinates\u00b6
\n Vtransform==2
as used in this example,np.interp
which expects 1D numpy arrays. This functionality is already implemented in xarray so we use that capability to make sure we are not making mistakes.[2]:\n
[3]:\n
[ ]:\n
\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -92,15 +92,15 @@\n File /usr/lib/python3/dist-packages/urllib3/connection.py:205, in\n HTTPConnection._new_conn(self)\n 204 except socket.gaierror as e:\n --> 205 raise NameResolutionError(self.host, self, e) from e\n 206 except SocketTimeout as e:\n \n NameResolutionError:
Now for the heavy lifting:\u00b6
\n groupby('time.season')
Dataset
and sum along the time dimension. Creating a DataArray
for the month length is as easy as using the days_in_month
accessor on the time coordinate. The calendar type, in this case 'noleap'
, is automatically considered in this operation.x
and y
, while the physical coordinates are xc
and yc
, which represent the longitudes and latitudes of the data.[3]:\n
Multiple plots and map projections\u00b6
\n <xarray.Dataset> Size: 41kB\n Dimensions: (time: 731, location: 3)\n Coordinates:\n * time (time) datetime64[ns] 6kB 2000-01-01 2000-01-02 ... 2001-12-31\n * location (location) <U2 24B 'IA' 'IN' 'IL'\n Data variables:\n tmin (time, location) float64 18kB -8.037 -1.788 ... -1.346 -4.544\n- tmax (time, location) float64 18kB 12.98 3.31 6.779 ... 3.343 3.805
PandasIndex(Index(['IA', 'IN', 'IL'], dtype='object', name='location'))
Examine a dataset with pandas and seaborn\u00b6
\n Convert to a pandas DataFrame\u00b6
\n [2]:\n@@ -697,15 +697,15 @@\n
[5]:\n
\n-<seaborn.axisgrid.PairGrid at 0xffff68e8ecf0>\n+<seaborn.axisgrid.PairGrid at 0xffff43acecf0>\n
\n@@ -1110,26 +1110,26 @@\n [0. , 0. , 0. ],\n [0. , 0. , 0. ],\n [0. , 0.01612903, 0. ],\n [0.33333333, 0.35 , 0.23333333],\n [0.93548387, 0.85483871, 0.82258065]])\n Coordinates:\n * location (location) <U2 24B 'IA' 'IN' 'IL'\n- * month (month) int64 96B 1 2 3 4 5 6 7 8 9 10 11 12
array(['IA', 'IN', 'IL'], dtype='<U2')
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
PandasIndex(Index(['IA', 'IN', 'IL'], dtype='object', name='location'))
PandasIndex(Index([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], dtype='int64', name='month'))
[7]:\n
freeze.to_pandas().plot()\n
PandasIndex(Index(['IA', 'IN', 'IL'], dtype='object', name='location'))
[12]:\n
df = both.sel(time="2000").mean("location").reset_coords(drop=True).to_dataframe()\n df.head()\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -142,15 +142,15 @@\n [4]:\n
array(['IA', 'IN', 'IL'], dtype='<U2')
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
PandasIndex(Index(['IA', 'IN', 'IL'], dtype='object', name='location'))
PandasIndex(Index([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], dtype='int64', name='month'))
Plotting\u00b6
\n In [37]: data.plot()\n-Out[37]: <matplotlib.collections.QuadMesh at 0xffff48f16660>\n+Out[37]: <matplotlib.collections.QuadMesh at 0xffff6981e900>\n
\n
pandas\u00b6
\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -253,15 +253,15 @@\n [0.37342613, 1.49497537, 1.33584385]])\n Coordinates:\n * x (x) int64 16B 10 20\n Dimensions without coordinates: y\n *\b**\b**\b**\b**\b* P\bPl\blo\bot\btt\bti\bin\bng\bg_\b?\b\u00b6 *\b**\b**\b**\b**\b*\n Visualizing your datasets is quick and convenient:\n In [37]: data.plot()\n-Out[37]: apply_ufunc
\", \"Compare weighted and unweighted mean temperature\", \"Blank template\", \"Calculating Seasonal Averages from Time Series of Monthly Means\", \"Working with Multidimensional Coordinates\", \"Visualization Gallery\", \"Toy weather data\", \"Gallery\", \"Frequently Asked Questions\", \"Getting Started\", \"Installation\", \"Quick overview\", \"Overview: Why xarray?\", \"Getting Help\", \"How do I \\u2026\", \"Xarray documentation\", \"Alternative chunked array types\", \"Integrating with duck arrays\", \"Extending xarray using accessors\", \"How to add a new backend\", \"How to create a custom index\", \"Xarray Internals\", \"Internal Design\", \"Interoperability of Xarray\", \"Time Coding\", \"Zarr Encoding Specification\", \"Development roadmap\", \"Tutorials and Videos\", \"Combining data\", \"Computation\", \"Parallel Computing with Dask\", \"Data Structures\", \"Working with numpy-like arrays\", \"GroupBy: Group and Bin Data\", \"Hierarchical data\", \"User Guide\", \"Indexing and selecting data\", \"Interpolating data\", \"Reading and writing files\", \"Configuration\", \"Working with pandas\", \"Plotting\", \"Reshaping and reorganizing data\", \"Terminology\", \"Testing your code\", \"Time series data\", \"Weather and climate data\", \"What\\u2019s New\"],\n \"titleterms\": {\n \"\": [13, 16, 55],\n \"0\": 55,\n \"01\": 55,\n \"02\": 55,\n"}]}, {"source1": "./usr/share/doc/python-xarray-doc/html/user-guide/computation.html", "source2": "./usr/share/doc/python-xarray-doc/html/user-guide/computation.html", "unified_diff": "@@ -934,16 +934,16 @@\n <xarray.Dataset> Size: 2kB\n Dimensions: (param: 10, cov_i: 10, cov_j: 10)\n Coordinates:\n * param (param) <U7 280B 'a0' 'xc0' ... 'xalpha1' 'yalpha1'\n * cov_i (cov_i) <U7 280B 'a0' 'xc0' ... 'xalpha1' 'yalpha1'\n * cov_j (cov_j) <U7 280B 'a0' 'xc0' ... 'xalpha1' 'yalpha1'\n Data variables:\n- curvefit_coefficients (param) float64 80B -0.659 4.858 ... 
2.066 1.329\n- curvefit_covariance (cov_i, cov_j) float64 800B 5.662e+11 ... 6.911e-05\n+ curvefit_coefficients (param) float64 80B 3.0 1.004 1.003 ... 1.007 1.008\n+ curvefit_covariance (cov_i, cov_j) float64 800B 3.362e-05 ... 2.125e-05\n \n \n scipy.optimize.curve_fit()
.func(ds)
). This allows you to write pipelines for\n transforming your data (using \u201cmethod chaining\u201d) instead of writing hard to\n follow nested function calls:
# these lines are equivalent, but with pipe we can make the logic flow\n # entirely from left to right\n In [64]: plt.plot((2 * ds.temperature.sel(loc=0)).mean("instrument"))\n-Out[64]: [<matplotlib.lines.Line2D at 0xffff20f17250>]\n+Out[64]: [<matplotlib.lines.Line2D at 0xffff5e193250>]\n \n In [65]: (ds.temperature.sel(loc=0).pipe(lambda x: 2 * x).mean("instrument").pipe(plt.plot))\n-Out[65]: [<matplotlib.lines.Line2D at 0xffff20f16fd0>]\n+Out[65]: [<matplotlib.lines.Line2D at 0xffff5e192fd0>]\n
Both pipe
and assign
replicate the pandas methods of the same names\n (DataFrame.pipe
and\n DataFrame.assign
).
With xarray, there is no performance penalty for creating new datasets, even if\n variables are lazily loaded from a file on disk. Creating new objects instead\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -585,19 +585,19 @@\n There is also the pipe() method that allows you to use a method call with an\n external function (e.g., ds.pipe(func)) instead of simply calling it (e.g.,\n func(ds)). This allows you to write pipelines for transforming your data (using\n \u201cmethod chaining\u201d) instead of writing hard to follow nested function calls:\n # these lines are equivalent, but with pipe we can make the logic flow\n # entirely from left to right\n In [64]: plt.plot((2 * ds.temperature.sel(loc=0)).mean(\"instrument\"))\n-Out[64]: [ If you were a previous user of the prototype xarray-contrib/datatree package, this is different from what you\u2019re used to!\n In that package the data model was that the data stored in each node actually was completely unrelated. The data model is now slightly stricter.\n This allows us to provide features like Coordinate Inheritance. To demonstrate, let\u2019s first generate some example datasets which are not aligned with one another: Now we have a valid This is a useful way to organise our data because we can still operate on all the groups at once.\n For example we can extract all three timeseries at a specific lat-lon location: or compute the standard deviation of each timeseries to find out how it varies with sampling frequency: This helps to differentiate which variables are defined on the datatree node that you are currently looking at, and which were defined somewhere above it. 
We can also still perform all the same operations on the whole tree:# (drop the attributes just to make the printed representation shorter)\n In [89]: ds = xr.tutorial.open_dataset("air_temperature").drop_attrs()\n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff1f9e6990>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff5cc22e90>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n In [90]: ds_daily = ds.resample(time="D").mean("time")\n KeyError: "No variable named 'time'. Variables on the dataset include ['foo', 'x', 'letters']"\n \n \n In [91]: ds_weekly = ds.resample(time="W").mean("time")\n@@ -1055,15 +1055,15 @@\n \u2514\u2500\u2500 Group: /b/B\n
DataTree
structure which contains all the data at each different time frequency, stored in a separate group.In [100]: dt.sel(lat=75, lon=300)\n-ValueError: Dimensions {'lon', 'lat'} do not exist. Expected one or more of set()\n+ValueError: Dimensions {'lat', 'lon'} do not exist. Expected one or more of set()\n
In [101]: dt.std(dim="time")\n ValueError: Dimension(s) 'time' do not exist. Expected one or more of set()\n
In [107]: print(dt["/daily"])\n KeyError: 'Could not find node at /daily'\n
In [108]: dt.sel(lat=[75], lon=[300])\n-ValueError: Dimensions {'lon', 'lat'} do not exist. Expected one or more of set()\n+ValueError: Dimensions {'lat', 'lon'} do not exist. Expected one or more of set()\n \n \n In [109]: dt.std(dim="time")\n ValueError: Dimension(s) 'time' do not exist. Expected one or more of set()\n
In [52]: ds = xr.tutorial.open_dataset("air_temperature")\n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff1f9e6990>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff5cc22c10>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n # Define target latitude and longitude (where weather stations might be)\n In [53]: target_lon = xr.DataArray([200, 201, 202, 205], dims="points")\n \n In [54]: target_lat = xr.DataArray([31, 41, 42, 42], dims="points")\n \n@@ -697,15 +697,15 @@\n
To select and assign values to a portion of a DataArray()
you\n can use indexing with .loc
:
In [57]: ds = xr.tutorial.open_dataset("air_temperature")\n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff1f9e5a90>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff5cc21090>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n # add an empty 2D dataarray\n In [58]: ds["empty"] = xr.full_like(ds.air.mean("time"), fill_value=0)\n AttributeError: 'Dataset' object has no attribute 'air'\n \n \n@@ -869,15 +869,15 @@\n
You can also assign values to all variables of a Dataset
at once:
In [83]: ds_org = xr.tutorial.open_dataset("eraint_uvz").isel(\n ....: latitude=slice(56, 59), longitude=slice(255, 258), level=0\n ....: )\n ....: \n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/eraint_uvz.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff1f9e5450>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/eraint_uvz.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff5cc20550>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n # set all values to 0\n In [84]: ds = xr.zeros_like(ds_org)\n NameError: name 'ds_org' is not defined\n \n \n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -476,15 +476,15 @@\n with a new shared output dimension name. 
In the example below, the selections\n of the closest latitude and longitude are renamed to an output dimension named\n \u201cpoints\u201d:\n In [52]: ds = xr.tutorial.open_dataset(\"air_temperature\")\n ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries\n exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by\n NameResolutionError(\": Failed to resolve 'github.com' ([Errno -3] Temporary failure\n+0xffff5cc22c10>: Failed to resolve 'github.com' ([Errno -3] Temporary failure\n in name resolution)\"))\n \n \n # Define target latitude and longitude (where weather stations might be)\n In [53]: target_lon = xr.DataArray([200, 201, 202, 205], dims=\"points\")\n \n In [54]: target_lat = xr.DataArray([31, 41, 42, 42], dims=\"points\")\n@@ -516,15 +516,15 @@\n *\b**\b**\b**\b**\b* A\bAs\bss\bsi\big\bgn\bni\bin\bng\bg v\bva\bal\blu\bue\bes\bs w\bwi\bit\bth\bh i\bin\bnd\bde\bex\bxi\bin\bng\bg_\b?\b\u00b6 *\b**\b**\b**\b**\b*\n To select and assign values to a portion of a DataArray() you can use indexing\n with .loc :\n In [57]: ds = xr.tutorial.open_dataset(\"air_temperature\")\n ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries\n exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by\n NameResolutionError(\": Failed to resolve 'github.com' ([Errno -3] Temporary failure\n+0xffff5cc21090>: Failed to resolve 'github.com' ([Errno -3] Temporary failure\n in name resolution)\"))\n \n \n # add an empty 2D dataarray\n In [58]: ds[\"empty\"] = xr.full_like(ds.air.mean(\"time\"), fill_value=0)\n AttributeError: 'Dataset' object has no attribute 'air'\n \n@@ -678,15 +678,15 @@\n In [83]: ds_org = xr.tutorial.open_dataset(\"eraint_uvz\").isel(\n ....: latitude=slice(56, 59), longitude=slice(255, 258), level=0\n ....: )\n ....:\n ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries\n exceeded with url: /pydata/xarray-data/raw/master/eraint_uvz.nc (Caused 
by\n NameResolutionError(\": Failed to resolve 'github.com' ([Errno -3] Temporary failure\n+0xffff5cc20550>: Failed to resolve 'github.com' ([Errno -3] Temporary failure\n in name resolution)\"))\n \n \n # set all values to 0\n In [84]: ds = xr.zeros_like(ds_org)\n NameError: name 'ds_org' is not defined\n \n"}]}, {"source1": "./usr/share/doc/python-xarray-doc/html/user-guide/interpolation.html", "source2": "./usr/share/doc/python-xarray-doc/html/user-guide/interpolation.html", "unified_diff": "@@ -237,24 +237,24 @@\n ....: np.sin(np.linspace(0, 2 * np.pi, 10)),\n ....: dims="x",\n ....: coords={"x": np.linspace(0, 1, 10)},\n ....: )\n ....: \n \n In [17]: da.plot.line("o", label="original")\n-Out[17]: [<matplotlib.lines.Line2D at 0xffff5a461a90>]\n+Out[17]: [<matplotlib.lines.Line2D at 0xffff69839a90>]\n \n In [18]: da.interp(x=np.linspace(0, 1, 100)).plot.line(label="linear (default)")\n-Out[18]: [<matplotlib.lines.Line2D at 0xffff5a4616d0>]\n+Out[18]: [<matplotlib.lines.Line2D at 0xffff698396d0>]\n \n In [19]: da.interp(x=np.linspace(0, 1, 100), method="cubic").plot.line(label="cubic")\n-Out[19]: [<matplotlib.lines.Line2D at 0xffff20f16fd0>]\n+Out[19]: [<matplotlib.lines.Line2D at 0xffff5e192fd0>]\n \n In [20]: plt.legend()\n-Out[20]: <matplotlib.legend.Legend at 0xffff48f17380>\n+Out[20]: <matplotlib.legend.Legend at 0xffff6981f380>\n
Additional keyword arguments can be passed to scipy\u2019s functions.
\n# fill 0 for the outside of the original coordinates.\n In [21]: da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={"fill_value": 0.0})\n@@ -439,15 +439,15 @@\n see Missing values.\n \n \n Example\u00b6
\n Let\u2019s see how interp()
works on real data.
\n # Raw data\n In [44]: ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)\n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff5a461e50>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff69839e50>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n In [45]: fig, axes = plt.subplots(ncols=2, figsize=(10, 4))\n \n In [46]: ds.air.plot(ax=axes[0])\n AttributeError: 'Dataset' object has no attribute 'air'\n \n@@ -511,15 +511,15 @@\n ....: axes[0].plot(*xr.broadcast(lon.isel(z=idx), lat.isel(z=idx)), "--k")\n ....: \n \n In [61]: axes[0].set_title("Raw data")\n Out[61]: Text(0.5, 1.0, 'Raw data')\n \n In [62]: dsi = ds.interp(lon=lon, lat=lat)\n-ValueError: Dimensions {'lon', 'lat'} do not exist. Expected one or more of FrozenMappingWarningOnValuesAccess({'x': 3, 'y': 4})\n+ValueError: Dimensions {'lat', 'lon'} do not exist. 
Expected one or more of FrozenMappingWarningOnValuesAccess({'x': 3, 'y': 4})\n \n \n In [63]: dsi.air.plot(ax=axes[1])\n NameError: name 'dsi' is not defined\n \n \n In [64]: axes[1].set_title("Remapped data")\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -153,26 +153,26 @@\n ....: np.sin(np.linspace(0, 2 * np.pi, 10)),\n ....: dims=\"x\",\n ....: coords={\"x\": np.linspace(0, 1, 10)},\n ....: )\n ....:\n \n In [17]: da.plot.line(\"o\", label=\"original\")\n-Out[17]: []\n+Out[17]: []\n \n In [18]: da.interp(x=np.linspace(0, 1, 100)).plot.line(label=\"linear\n (default)\")\n-Out[18]: []\n+Out[18]: []\n \n In [19]: da.interp(x=np.linspace(0, 1, 100), method=\"cubic\").plot.line\n (label=\"cubic\")\n-Out[19]: []\n+Out[19]: []\n \n In [20]: plt.legend()\n-Out[20]: \n+Out[20]: \n _\b[_\b._\b._\b/_\b__\bi_\bm_\ba_\bg_\be_\bs_\b/_\bi_\bn_\bt_\be_\br_\bp_\bo_\bl_\ba_\bt_\bi_\bo_\bn_\b__\bs_\ba_\bm_\bp_\bl_\be_\b1_\b._\bp_\bn_\bg_\b]\n Additional keyword arguments can be passed to scipy\u2019s functions.\n # fill 0 for the outside of the original coordinates.\n In [21]: da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={\"fill_value\": 0.0})\n Out[21]:\n Size: 80B\n array([ 0. , 0. , 0. , 0.814, 0.604, -0.604, -0.814, 0. , 0. 
,\n@@ -338,15 +338,15 @@\n *\b**\b**\b**\b**\b* E\bEx\bxa\bam\bmp\bpl\ble\be_\b?\b\u00b6 *\b**\b**\b**\b**\b*\n Let\u2019s see how interp() works on real data.\n # Raw data\n In [44]: ds = xr.tutorial.open_dataset(\"air_temperature\").isel(time=0)\n ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries\n exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by\n NameResolutionError(\": Failed to resolve 'github.com' ([Errno -3] Temporary failure\n+0xffff69839e50>: Failed to resolve 'github.com' ([Errno -3] Temporary failure\n in name resolution)\"))\n \n \n In [45]: fig, axes = plt.subplots(ncols=2, figsize=(10, 4))\n \n In [46]: ds.air.plot(ax=axes[0])\n AttributeError: 'Dataset' object has no attribute 'air'\n@@ -411,15 +411,15 @@\n k\")\n ....:\n \n In [61]: axes[0].set_title(\"Raw data\")\n Out[61]: Text(0.5, 1.0, 'Raw data')\n \n In [62]: dsi = ds.interp(lon=lon, lat=lat)\n-ValueError: Dimensions {'lon', 'lat'} do not exist. Expected one or more of\n+ValueError: Dimensions {'lat', 'lon'} do not exist. Expected one or more of\n FrozenMappingWarningOnValuesAccess({'x': 3, 'y': 4})\n \n \n In [63]: dsi.air.plot(ax=axes[1])\n NameError: name 'dsi' is not defined\n \n \n"}]}, {"source1": "./usr/share/doc/python-xarray-doc/html/user-guide/io.html", "source2": "./usr/share/doc/python-xarray-doc/html/user-guide/io.html", "unified_diff": "@@ -630,15 +630,15 @@\n ....: "y": pd.date_range("2000-01-01", periods=5),\n ....: "z": ("x", list("abcd")),\n ....: },\n ....: )\n ....: \n \n In [13]: ds.to_zarr("path/to/directory.zarr")\n-Out[13]: <xarray.backends.zarr.ZarrStore at 0xffff1fe3dbd0>\n+Out[13]: <xarray.backends.zarr.ZarrStore at 0xffff5d17a0e0>\n
\n \n (The suffix .zarr
is optional\u2013just a reminder that a zarr store lives\n there.) If the directory does not exist, it will be created. If a zarr\n store is already present at that path, an error will be raised, preventing it\n from being overwritten. To override this behavior and overwrite an existing\n store, add mode='w'
when invoking to_zarr()
.
\n@@ -658,19 +658,19 @@\n To read back a zarr dataset that has been created this way, we use the\n open_zarr()
method:
\n In [14]: ds_zarr = xr.open_zarr("path/to/directory.zarr")\n \n In [15]: ds_zarr\n Out[15]: \n <xarray.Dataset> Size: 264B\n-Dimensions: (y: 5, x: 4)\n+Dimensions: (x: 4, y: 5)\n Coordinates:\n- * y (y) datetime64[ns] 40B 2000-01-01 2000-01-02 ... 2000-01-05\n- * x (x) int64 32B 10 20 30 40\n z (x) object 32B dask.array<chunksize=(4,), meta=np.ndarray>\n+ * x (x) int64 32B 10 20 30 40\n+ * y (y) datetime64[ns] 40B 2000-01-01 2000-01-02 ... 2000-01-05\n Data variables:\n foo (x, y) float64 160B dask.array<chunksize=(4, 5), meta=np.ndarray>\n
\n \n \n Cloud Storage Buckets\u00b6
\n It is possible to read and write xarray datasets directly from / to cloud\n@@ -724,36 +724,36 @@\n \n In [18]: ds = xr.Dataset({"foo": ("x", dummies)}, coords={"x": np.arange(30)})\n \n In [19]: path = "path/to/directory.zarr"\n \n # Now we write the metadata without computing any array values\n In [20]: ds.to_zarr(path, compute=False)\n-Out[20]: Delayed('_finalize_store-6768267c-47e4-4996-be9d-3b09c8b6f322')\n+Out[20]: Delayed('_finalize_store-c1ff047f-1306-4b87-b8dd-3eb0c4b41a69')\n
Now, a Zarr store with the correct variable shapes and attributes exists that\n can be filled out by subsequent calls to to_zarr
.\n Setting region="auto"
will open the existing store and determine the\n correct alignment of the new data with the existing dimensions, or as an\n explicit mapping from dimension names to Python slice
objects indicating\n where the data should be written (in index space, not label space), e.g.,
# For convenience, we'll slice a single dataset, but in the real use-case\n # we would create them separately possibly even from separate processes.\n In [21]: ds = xr.Dataset({"foo": ("x", np.arange(30))}, coords={"x": np.arange(30)})\n \n # Any of the following region specifications are valid\n In [22]: ds.isel(x=slice(0, 10)).to_zarr(path, region="auto")\n-Out[22]: <xarray.backends.zarr.ZarrStore at 0xffff1fe3eef0>\n+Out[22]: <xarray.backends.zarr.ZarrStore at 0xffff5d17b400>\n \n In [23]: ds.isel(x=slice(10, 20)).to_zarr(path, region={"x": "auto"})\n-Out[23]: <xarray.backends.zarr.ZarrStore at 0xffff1fe3f010>\n+Out[23]: <xarray.backends.zarr.ZarrStore at 0xffff5d17b370>\n \n In [24]: ds.isel(x=slice(20, 30)).to_zarr(path, region={"x": slice(20, 30)})\n-Out[24]: <xarray.backends.zarr.ZarrStore at 0xffff20218e50>\n+Out[24]: <xarray.backends.zarr.ZarrStore at 0xffff5d341360>\n
Concurrent writes with region
are safe as long as they modify distinct\n chunks in the underlying Zarr arrays (or use an appropriate lock
).
As a safety check to make it harder to inadvertently override existing values,\n if you set region
then all variables included in a Dataset must have\n dimensions included in region
. Other variables (typically coordinates)\n@@ -816,28 +816,28 @@\n ....: "y": [1, 2, 3, 4, 5],\n ....: "t": pd.date_range("2001-01-01", periods=2),\n ....: },\n ....: )\n ....: \n \n In [30]: ds1.to_zarr("path/to/directory.zarr")\n-Out[30]: <xarray.backends.zarr.ZarrStore at 0xffff1fef0160>\n+Out[30]: <xarray.backends.zarr.ZarrStore at 0xffff5d040670>\n \n In [31]: ds2 = xr.Dataset(\n ....: {"foo": (("x", "y", "t"), np.random.rand(4, 5, 2))},\n ....: coords={\n ....: "x": [10, 20, 30, 40],\n ....: "y": [1, 2, 3, 4, 5],\n ....: "t": pd.date_range("2001-01-03", periods=2),\n ....: },\n ....: )\n ....: \n \n In [32]: ds2.to_zarr("path/to/directory.zarr", append_dim="t")\n-Out[32]: <xarray.backends.zarr.ZarrStore at 0xffff1fef00d0>\n+Out[32]: <xarray.backends.zarr.ZarrStore at 0xffff5d0405e0>\n
Chunk sizes may be specified in one of three ways when writing to a zarr store:
\nFor example, let\u2019s say we\u2019re working with a dataset with dimensions\n ('time', 'x', 'y')
, a variable Tair
which is chunked in x
and y
,\n and two multi-dimensional coordinates xc
and yc
:
In [33]: ds = xr.tutorial.open_dataset("rasm")\n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/rasm.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff1ff57610>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/rasm.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff5d044190>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n In [34]: ds["Tair"] = ds["Tair"].chunk({"x": 100, "y": 100})\n KeyError: "No variable named 'Tair'. Variables on the dataset include ['foo', 'x']"\n \n \n In [35]: ds\n@@ -882,15 +882,15 @@\n foo (x) int64 240B 0 1 2 3 4 5 6 7 8 9 ... 21 22 23 24 25 26 27 28 29\n
These multi-dimensional coordinates are only two-dimensional and take up very little\n space on disk or in memory, yet when writing to disk the default zarr behavior is to\n split them into chunks:
\nIn [36]: ds.to_zarr("path/to/directory.zarr", mode="w")\n-Out[36]: <xarray.backends.zarr.ZarrStore at 0xffff1fef0550>\n+Out[36]: <xarray.backends.zarr.ZarrStore at 0xffff5d040a60>\n \n In [37]: ! ls -R path/to/directory.zarr\n path/to/directory.zarr:\n foo x\tzarr.json\n \n path/to/directory.zarr/foo:\n c zarr.json\n@@ -1081,15 +1081,15 @@\n Ncdata\u00b6
\n Ncdata provides more sophisticated means of transferring data, including entire\n datasets. It uses the file saving and loading functions in both projects to provide a\n more \u201ccorrect\u201d translation between them, but still with very low overhead and not\n using actual disk files.
\n For example:
\n In [48]: ds = xr.tutorial.open_dataset("air_temperature_gradient")\n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature_gradient.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff69df0cd0>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature_gradient.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff5d0456d0>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n In [49]: cubes = ncdata.iris_xarray.cubes_from_xarray(ds)\n NameError: name 'ncdata' is not defined\n \n \n In [50]: print(cubes)\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -481,15 +481,15 @@\n ....: \"y\": pd.date_range(\"2000-01-01\", periods=5),\n ....: \"z\": (\"x\", list(\"abcd\")),\n ....: },\n ....: )\n ....:\n \n In [13]: ds.to_zarr(\"path/to/directory.zarr\")\n-Out[13]: \n+Out[13]: \n (The suffix .zarr is optional\u2013just a reminder that a zarr store lives there.)\n If the directory does not exist, it will be created. If a zarr store is already\n present at that path, an error will be raised, preventing it from being\n overwritten. 
To override this behavior and overwrite an existing store, add\n mode='w' when invoking to_zarr().\n DataArrays can also be saved to disk using the DataArray.to_zarr() method, and\n loaded from disk using the open_dataarray() function with engine='zarr'.\n@@ -505,19 +505,19 @@\n To read back a zarr dataset that has been created this way, we use the\n open_zarr() method:\n In [14]: ds_zarr = xr.open_zarr(\"path/to/directory.zarr\")\n \n In [15]: ds_zarr\n Out[15]:\n Size: 264B\n-Dimensions: (y: 5, x: 4)\n+Dimensions: (x: 4, y: 5)\n Coordinates:\n- * y (y) datetime64[ns] 40B 2000-01-01 2000-01-02 ... 2000-01-05\n- * x (x) int64 32B 10 20 30 40\n z (x) object 32B dask.array\n+ * x (x) int64 32B 10 20 30 40\n+ * y (y) datetime64[ns] 40B 2000-01-01 2000-01-02 ... 2000-01-05\n Data variables:\n foo (x, y) float64 160B dask.array\n *\b**\b**\b**\b* C\bCl\blo\bou\bud\bd S\bSt\bto\bor\bra\bag\bge\be B\bBu\buc\bck\bke\bet\bts\bs_\b?\b\u00b6 *\b**\b**\b**\b*\n It is possible to read and write xarray datasets directly from / to cloud\n storage buckets using zarr. This example uses the _\bg_\bc_\bs_\bf_\bs package to provide an\n interface to _\bG_\bo_\bo_\bg_\bl_\be_\b _\bC_\bl_\bo_\bu_\bd_\b _\bS_\bt_\bo_\br_\ba_\bg_\be.\n General _\bf_\bs_\bs_\bp_\be_\bc URLs, those that begin with s3:// or gcs:// for example, are\n@@ -562,35 +562,35 @@\n \n In [18]: ds = xr.Dataset({\"foo\": (\"x\", dummies)}, coords={\"x\": np.arange(30)})\n \n In [19]: path = \"path/to/directory.zarr\"\n \n # Now we write the metadata without computing any array values\n In [20]: ds.to_zarr(path, compute=False)\n-Out[20]: Delayed('_finalize_store-6768267c-47e4-4996-be9d-3b09c8b6f322')\n+Out[20]: Delayed('_finalize_store-c1ff047f-1306-4b87-b8dd-3eb0c4b41a69')\n Now, a Zarr store with the correct variable shapes and attributes exists that\n can be filled out by subsequent calls to to_zarr. 
Setting region=\"auto\" will\n open the existing store and determine the correct alignment of the new data\n with the existing dimensions, or as an explicit mapping from dimension names to\n Python slice objects indicating where the data should be written (in index\n space, not label space), e.g.,\n # For convenience, we'll slice a single dataset, but in the real use-case\n # we would create them separately possibly even from separate processes.\n In [21]: ds = xr.Dataset({\"foo\": (\"x\", np.arange(30))}, coords={\"x\": np.arange\n (30)})\n \n # Any of the following region specifications are valid\n In [22]: ds.isel(x=slice(0, 10)).to_zarr(path, region=\"auto\")\n-Out[22]: \n+Out[22]: \n \n In [23]: ds.isel(x=slice(10, 20)).to_zarr(path, region={\"x\": \"auto\"})\n-Out[23]: \n+Out[23]: \n \n In [24]: ds.isel(x=slice(20, 30)).to_zarr(path, region={\"x\": slice(20, 30)})\n-Out[24]: \n+Out[24]: \n Concurrent writes with region are safe as long as they modify distinct chunks\n in the underlying Zarr arrays (or use an appropriate lock).\n As a safety check to make it harder to inadvertently override existing values,\n if you set region then a\bal\bll\bl variables included in a Dataset must have dimensions\n included in region. 
Other variables (typically coordinates) need to be\n explicitly dropped and/or written in a separate calls to to_zarr with mode='a'.\n *\b**\b**\b**\b* Z\bZa\bar\brr\br C\bCo\bom\bmp\bpr\bre\bes\bss\bso\bor\brs\bs a\ban\bnd\bd F\bFi\bil\blt\bte\ber\brs\bs_\b?\b\u00b6 *\b**\b**\b**\b*\n@@ -636,28 +636,28 @@\n ....: \"y\": [1, 2, 3, 4, 5],\n ....: \"t\": pd.date_range(\"2001-01-01\", periods=2),\n ....: },\n ....: )\n ....:\n \n In [30]: ds1.to_zarr(\"path/to/directory.zarr\")\n-Out[30]: \n+Out[30]: \n \n In [31]: ds2 = xr.Dataset(\n ....: {\"foo\": ((\"x\", \"y\", \"t\"), np.random.rand(4, 5, 2))},\n ....: coords={\n ....: \"x\": [10, 20, 30, 40],\n ....: \"y\": [1, 2, 3, 4, 5],\n ....: \"t\": pd.date_range(\"2001-01-03\", periods=2),\n ....: },\n ....: )\n ....:\n \n In [32]: ds2.to_zarr(\"path/to/directory.zarr\", append_dim=\"t\")\n-Out[32]: \n+Out[32]: \n *\b**\b**\b**\b* S\bSp\bpe\bec\bci\bif\bfy\byi\bin\bng\bg c\bch\bhu\bun\bnk\bks\bs i\bin\bn a\ba z\bza\bar\brr\br s\bst\bto\bor\bre\be_\b?\b\u00b6 *\b**\b**\b**\b*\n Chunk sizes may be specified in one of three ways when writing to a zarr store:\n 1. Manual chunk sizing through the use of the encoding argument in\n Dataset.to_zarr():\n 2. Automatic chunking based on chunks in dask arrays\n 3. 
Default chunk behavior determined by the zarr library\n The resulting chunks will be determined based on the order of the above list;\n@@ -678,15 +678,15 @@\n For example, let\u2019s say we\u2019re working with a dataset with dimensions ('time',\n 'x', 'y'), a variable Tair which is chunked in x and y, and two multi-\n dimensional coordinates xc and yc:\n In [33]: ds = xr.tutorial.open_dataset(\"rasm\")\n ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries\n exceeded with url: /pydata/xarray-data/raw/master/rasm.nc (Caused by\n NameResolutionError(\": Failed to resolve 'github.com' ([Errno -3] Temporary failure\n+0xffff5d044190>: Failed to resolve 'github.com' ([Errno -3] Temporary failure\n in name resolution)\"))\n \n \n In [34]: ds[\"Tair\"] = ds[\"Tair\"].chunk({\"x\": 100, \"y\": 100})\n KeyError: \"No variable named 'Tair'. Variables on the dataset include ['foo',\n 'x']\"\n \n@@ -699,15 +699,15 @@\n * x (x) int64 240B 0 1 2 3 4 5 6 7 8 9 ... 21 22 23 24 25 26 27 28 29\n Data variables:\n foo (x) int64 240B 0 1 2 3 4 5 6 7 8 9 ... 21 22 23 24 25 26 27 28 29\n These multi-dimensional coordinates are only two-dimensional and take up very\n little space on disk or in memory, yet when writing to disk the default zarr\n behavior is to split them into chunks:\n In [36]: ds.to_zarr(\"path/to/directory.zarr\", mode=\"w\")\n-Out[36]: \n+Out[36]: \n \n In [37]: ! 
ls -R path/to/directory.zarr\n path/to/directory.zarr:\n foo x\tzarr.json\n \n path/to/directory.zarr/foo:\n c zarr.json\n@@ -874,15 +874,15 @@\n provide a more \u201ccorrect\u201d translation between them, but still with very low\n overhead and not using actual disk files.\n For example:\n In [48]: ds = xr.tutorial.open_dataset(\"air_temperature_gradient\")\n ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries\n exceeded with url: /pydata/xarray-data/raw/master/air_temperature_gradient.nc\n (Caused by NameResolutionError(\": Failed to resolve 'github.com' ([Errno -3] Temporary failure\n+0xffff5d0456d0>: Failed to resolve 'github.com' ([Errno -3] Temporary failure\n in name resolution)\"))\n \n \n In [49]: cubes = ncdata.iris_xarray.cubes_from_xarray(ds)\n NameError: name 'ncdata' is not defined\n \n \n"}]}, {"source1": "./usr/share/doc/python-xarray-doc/html/user-guide/plotting.html", "source2": "./usr/share/doc/python-xarray-doc/html/user-guide/plotting.html", "unified_diff": "@@ -100,15 +100,15 @@\n In [3]: import matplotlib.pyplot as plt\n \n In [4]: import xarray as xr\n
\n \n For these examples we\u2019ll use the North American air temperature dataset.
\n In [5]: airtemps = xr.tutorial.open_dataset("air_temperature")\n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff1ff57610>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff5d044690>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n In [6]: airtemps\n NameError: name 'airtemps' is not defined\n \n \n # Convert to celsius\n@@ -445,15 +445,15 @@\n \n # Apply a nonlinear transformation to one of the coords\n In [50]: b.coords["lat"] = np.log(b.coords["lat"])\n KeyError: 'lat'\n \n \n In [51]: b.plot()\n-Out[51]: [<matplotlib.lines.Line2D at 0xffff693ea990>]\n+Out[51]: [<matplotlib.lines.Line2D at 0xffff9259ce10>]\n
\n \n
\n \n \n \n Other types of plot\u00b6
\n@@ -857,117 +857,117 @@\n * y (y) float64 88B 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0\n * z (z) int64 32B 0 1 2 3\n * w (w) <U5 80B 'one' 'two' 'three' 'five'\n Attributes:\n units: Aunits\n \n In [99]: ds.A.plot.scatter(x="y")\n-Out[99]: <matplotlib.collections.PathCollection at 0xffff1fee3b60>\n+Out[99]: <matplotlib.collections.PathCollection at 0xffff9288b770>\n
Same plot can be displayed using the dataset:
\nIn [100]: ds.plot.scatter(x="y", y="A")\n-Out[100]: <matplotlib.collections.PathCollection at 0xffff6926e5d0>\n+Out[100]: <matplotlib.collections.PathCollection at 0xffff924a0a50>\n
Now suppose we want to scatter the A
DataArray against the B
DataArray
In [101]: ds.plot.scatter(x="A", y="B")\n-Out[101]: <matplotlib.collections.PathCollection at 0xffff6936f890>\n+Out[101]: <matplotlib.collections.PathCollection at 0xffff92590410>\n
The hue
kwarg lets you vary the color by variable value
In [102]: ds.plot.scatter(x="A", y="B", hue="w")\n-Out[102]: <matplotlib.collections.PathCollection at 0xffff69449bd0>\n+Out[102]: <matplotlib.collections.PathCollection at 0xffff92677110>\n
You can force a legend instead of a colorbar by setting add_legend=True, add_colorbar=False
.
In [103]: ds.plot.scatter(x="A", y="B", hue="w", add_legend=True, add_colorbar=False)\n-Out[103]: <matplotlib.collections.PathCollection at 0xffff48ea7890>\n+Out[103]: <matplotlib.collections.PathCollection at 0xffff698751d0>\n
In [104]: ds.plot.scatter(x="A", y="B", hue="w", add_legend=False, add_colorbar=True)\n-Out[104]: <matplotlib.collections.PathCollection at 0xffff48f9c7d0>\n+Out[104]: <matplotlib.collections.PathCollection at 0xffff69882350>\n
The markersize
kwarg lets you vary the point\u2019s size by variable value.\n You can additionally pass size_norm
to control how the variable\u2019s values are mapped to point sizes.
In [105]: ds.plot.scatter(x="A", y="B", hue="y", markersize="z")\n-Out[105]: <matplotlib.collections.PathCollection at 0xffff5a353ed0>\n+Out[105]: <matplotlib.collections.PathCollection at 0xffff8353cf50>\n
The z
kwarg lets you plot the data along the z-axis as well.
In [106]: ds.plot.scatter(x="A", y="B", z="z", hue="y", markersize="x")\n-Out[106]: <mpl_toolkits.mplot3d.art3d.Path3DCollection at 0xffff48f68910>\n+Out[106]: <mpl_toolkits.mplot3d.art3d.Path3DCollection at 0xffff69854910>\n
Faceting is also possible
\nIn [107]: ds.plot.scatter(x="A", y="B", hue="y", markersize="x", row="x", col="w")\n-Out[107]: <xarray.plot.facetgrid.FacetGrid at 0xffff1fee0050>\n+Out[107]: <xarray.plot.facetgrid.FacetGrid at 0xffff5cfd4c20>\n
And adding the z-axis
\nIn [108]: ds.plot.scatter(x="A", y="B", z="z", hue="y", markersize="x", row="x", col="w")\n-Out[108]: <xarray.plot.facetgrid.FacetGrid at 0xffff68bbd450>\n+Out[108]: <xarray.plot.facetgrid.FacetGrid at 0xffff91df1450>\n
For more advanced scatter plots, we recommend converting the relevant data variables\n to a pandas DataFrame and using the extensive plotting capabilities of seaborn
.
Visualizing vector fields is supported with quiver plots:
\nIn [109]: ds.isel(w=1, z=1).plot.quiver(x="x", y="y", u="A", v="B")\n-Out[109]: <matplotlib.quiver.Quiver at 0xffff1fa441a0>\n+Out[109]: <matplotlib.quiver.Quiver at 0xffff5cc66ba0>\n
where u
and v
denote the x and y direction components of the arrow vectors. Again, faceting is also possible:
In [110]: ds.plot.quiver(x="x", y="y", u="A", v="B", col="w", row="z", scale=4)\n-Out[110]: <xarray.plot.facetgrid.FacetGrid at 0xffff689bcf50>\n+Out[110]: <xarray.plot.facetgrid.FacetGrid at 0xffff91becf50>\n
scale
is required for faceted quiver plots.\n The scale determines the number of data units per arrow length unit, i.e. a smaller scale parameter makes the arrow longer.
Visualizing vector fields is also supported with streamline plots:
\nIn [111]: ds.isel(w=1, z=1).plot.streamplot(x="x", y="y", u="A", v="B")\n-Out[111]: <matplotlib.collections.LineCollection at 0xffff6850f4d0>\n+Out[111]: <matplotlib.collections.LineCollection at 0xffff9173f4d0>\n
where u
and v
denote the x and y direction components of the vectors tangent to the streamlines.\n Again, faceting is also possible:
In [112]: ds.plot.streamplot(x="x", y="y", u="A", v="B", col="w", row="z")\n-Out[112]: <xarray.plot.facetgrid.FacetGrid at 0xffff1f9a4050>\n+Out[112]: <xarray.plot.facetgrid.FacetGrid at 0xffff5d605220>\n
To follow this section you\u2019ll need to have Cartopy installed and working.
\nThis script will plot the air temperature on a map.
\nIn [113]: import cartopy.crs as ccrs\n \n In [114]: air = xr.tutorial.open_dataset("air_temperature").air\n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff68b2cb90>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff91d58b90>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n In [115]: p = air.isel(time=0).plot(\n .....: subplot_kws=dict(projection=ccrs.Orthographic(-80, 35), facecolor="gray"),\n .....: transform=ccrs.PlateCarree(),\n .....: )\n .....: \n@@ -1024,24 +1024,24 @@\n In [121]: import xarray.plot as xplt\n \n In [122]: da = xr.DataArray(range(5))\n \n In [123]: fig, axs = plt.subplots(ncols=2, nrows=2)\n \n In [124]: da.plot(ax=axs[0, 0])\n-Out[124]: [<matplotlib.lines.Line2D at 0xffff67f99a90>]\n+Out[124]: [<matplotlib.lines.Line2D at 0xffff911c9a90>]\n \n In [125]: da.plot.line(ax=axs[0, 1])\n-Out[125]: [<matplotlib.lines.Line2D at 0xffff67f99bd0>]\n+Out[125]: [<matplotlib.lines.Line2D at 0xffff911c9bd0>]\n \n In [126]: xplt.plot(da, ax=axs[1, 0])\n-Out[126]: [<matplotlib.lines.Line2D at 0xffff67f99d10>]\n+Out[126]: [<matplotlib.lines.Line2D at 0xffff911c9d10>]\n \n In [127]: xplt.line(da, ax=axs[1, 1])\n-Out[127]: [<matplotlib.lines.Line2D at 0xffff67f99e50>]\n+Out[127]: [<matplotlib.lines.Line2D at 0xffff911c9e50>]\n \n In [128]: plt.tight_layout()\n \n In [129]: plt.draw()\n
\n \n
\n@@ -1091,15 +1091,15 @@\n
The plot will produce an image corresponding to the values of the array.\n Hence the top left pixel will be a different color than the others.\n Before reading on, you may want to look at the coordinates and\n think carefully about what the limits, labels, and orientation for\n each of the axes should be.
\nIn [134]: a.plot()\n-Out[134]: <matplotlib.collections.QuadMesh at 0xffff68009310>\n+Out[134]: <matplotlib.collections.QuadMesh at 0xffff91235310>\n
It may seem strange that\n the values on the y axis are decreasing with -0.5 on the top. This is because\n the pixels are centered over their coordinates, and the\n@@ -1122,57 +1122,57 @@\n .....: np.arange(20).reshape(4, 5),\n .....: dims=["y", "x"],\n .....: coords={"lat": (("y", "x"), lat), "lon": (("y", "x"), lon)},\n .....: )\n .....: \n \n In [139]: da.plot.pcolormesh(x="lon", y="lat")\n-Out[139]: <matplotlib.collections.QuadMesh at 0xffff68629590>\n+Out[139]: <matplotlib.collections.QuadMesh at 0xffff91851590>\n
Note that in this case, xarray still follows the pixel centered convention.\n This might be undesirable in some cases, for example when your data is defined\n on a polar projection (GH781). This is why the default is to not follow\n this convention when plotting on a map:
\nIn [140]: import cartopy.crs as ccrs\n \n In [141]: ax = plt.subplot(projection=ccrs.PlateCarree())\n \n In [142]: da.plot.pcolormesh(x="lon", y="lat", ax=ax)\n-Out[142]: <cartopy.mpl.geocollection.GeoQuadMesh at 0xffff680dac10>\n+Out[142]: <cartopy.mpl.geocollection.GeoQuadMesh at 0xffff91306c10>\n \n In [143]: ax.scatter(lon, lat, transform=ccrs.PlateCarree())\n-Out[143]: <matplotlib.collections.PathCollection at 0xffff6813ca50>\n+Out[143]: <matplotlib.collections.PathCollection at 0xffff91364a50>\n \n In [144]: ax.coastlines()\n-Out[144]: <cartopy.mpl.feature_artist.FeatureArtist at 0xffff1fee0590>\n+Out[144]: <cartopy.mpl.feature_artist.FeatureArtist at 0xffff5cc67230>\n \n In [145]: ax.gridlines(draw_labels=True)\n-Out[145]: <cartopy.mpl.gridliner.Gridliner at 0xffff21e51d30>\n+Out[145]: <cartopy.mpl.gridliner.Gridliner at 0xffff5cc64ad0>\n
You can however decide to infer the cell boundaries and use the\n infer_intervals
keyword:
In [146]: ax = plt.subplot(projection=ccrs.PlateCarree())\n \n In [147]: da.plot.pcolormesh(x="lon", y="lat", ax=ax, infer_intervals=True)\n-Out[147]: <cartopy.mpl.geocollection.GeoQuadMesh at 0xffff67c9f250>\n+Out[147]: <cartopy.mpl.geocollection.GeoQuadMesh at 0xffff90ecf250>\n \n In [148]: ax.scatter(lon, lat, transform=ccrs.PlateCarree())\n-Out[148]: <matplotlib.collections.PathCollection at 0xffff68947c50>\n+Out[148]: <matplotlib.collections.PathCollection at 0xffff91b77c50>\n \n In [149]: ax.coastlines()\n-Out[149]: <cartopy.mpl.feature_artist.FeatureArtist at 0xffff68947b10>\n+Out[149]: <cartopy.mpl.feature_artist.FeatureArtist at 0xffff91b77b10>\n \n In [150]: ax.gridlines(draw_labels=True)\n-Out[150]: <cartopy.mpl.gridliner.Gridliner at 0xffff68947390>\n+Out[150]: <cartopy.mpl.gridliner.Gridliner at 0xffff91b77390>\n
Note
\nThe data model of xarray does not support datasets with cell boundaries\n@@ -1180,26 +1180,26 @@\n outside the xarray framework.
\nOne can also make line plots with multidimensional coordinates. In this case, hue
must be a dimension name, not a coordinate name.
In [151]: f, ax = plt.subplots(2, 1)\n \n In [152]: da.plot.line(x="lon", hue="y", ax=ax[0])\n Out[152]: \n-[<matplotlib.lines.Line2D at 0xffff67c6e210>,\n- <matplotlib.lines.Line2D at 0xffff67c6e0d0>,\n- <matplotlib.lines.Line2D at 0xffff67c6df90>,\n- <matplotlib.lines.Line2D at 0xffff67c6e5d0>]\n+[<matplotlib.lines.Line2D at 0xffff90e9a210>,\n+ <matplotlib.lines.Line2D at 0xffff90e9a0d0>,\n+ <matplotlib.lines.Line2D at 0xffff90e99f90>,\n+ <matplotlib.lines.Line2D at 0xffff90e9a5d0>]\n \n In [153]: da.plot.line(x="lon", hue="x", ax=ax[1])\n Out[153]: \n-[<matplotlib.lines.Line2D at 0xffff68712ad0>,\n- <matplotlib.lines.Line2D at 0xffff68712990>,\n- <matplotlib.lines.Line2D at 0xffff68712850>,\n- <matplotlib.lines.Line2D at 0xffff68712e90>,\n- <matplotlib.lines.Line2D at 0xffff687134d0>]\n+[<matplotlib.lines.Line2D at 0xffff9193ead0>,\n+ <matplotlib.lines.Line2D at 0xffff9193e990>,\n+ <matplotlib.lines.Line2D at 0xffff9193e850>,\n+ <matplotlib.lines.Line2D at 0xffff9193ee90>,\n+ <matplotlib.lines.Line2D at 0xffff9193f4d0>]\n
Whilst coarsen
is normally used for reducing your data\u2019s resolution by applying a reduction function\n (see the page on computation),\n it can also be used to reorganise your data without applying a computation via construct()
.
Taking our example tutorial air temperature dataset over the Northern US
\nIn [56]: air = xr.tutorial.open_dataset("air_temperature")["air"]\n-ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff67c6d450>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n+ConnectionError: HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: /pydata/xarray-data/raw/master/air_temperature.nc (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0xffff90e99450>: Failed to resolve 'github.com' ([Errno -3] Temporary failure in name resolution)"))\n \n \n In [57]: air.isel(time=0).plot(x="lon", y="lat")\n NameError: name 'air' is not defined\n
To see an example of what each of these strategies might produce, you can call one followed by the .example()
method,\n which is a general hypothesis method valid for all strategies.
In [2]: import xarray.testing.strategies as xrst\n \n In [3]: xrst.variables().example()\n Out[3]: \n-<xarray.Variable (b\u017b\u0136: 1)> Size: 16B\n-array([inf-2.225e-313j])\n-Attributes:\n- : False\n- \u0114: [[b'\\xc6\\xcb#']\\n [b'\\x1e\\xb1\\xaf']]\n- NaN: ['\\x08\\x86\u00f8s\u00af\\x94\u00c3\\x07\\x81' '\\U0008e3dc\u00e7\u00fe\u00eb\\U000caea3\\x17\\U000e0...\n+<xarray.Variable (\u00cc\u0143: 1, \u0152\u0116: 3, \u017dO: 5)> Size: 120B\n+array([[[-5.000e-01, -3.541e+15, -5.000e-01, -5.000e-01, -5.000e-01],\n+ [-5.000e-01, -5.000e-01, -5.000e-01, -5.000e-01, -5.000e-01],\n+ [-5.000e-01, -6.104e-05, -5.000e-01, -5.000e-01, -5.000e-01]]], shape=(1, 3, 5))\n \n In [4]: xrst.variables().example()\n Out[4]: \n <xarray.Variable (0: 1)> Size: 1B\n-array([0], dtype=int8)\n+array([0], dtype=uint8)\n \n In [5]: xrst.variables().example()\n Out[5]: \n-<xarray.Variable (7: 6)> Size: 24B\n-array([ 39788, 12523, 1851887664, 36979, 3514887451, 74], dtype=uint32)\n+<xarray.Variable (2Z\u017b\u017eu: 4, \u017c: 2)> Size: 64B\n+array([[ 5094675196065995282, 32272],\n+ [ 5094675196065995282, 5094675196065995282],\n+ [ 5094675196065995282, 5094675196065995282],\n+ [ 5094675196065995282, 12636881898743088449]], dtype=uint64)\n+Attributes:\n+ : {'': '\u00c4', '\u017d\u0160\u017f\u017d\u00ee': '\u0114\u0130tg\u00f0', 'false': False, '\u0109\u0119\u017e\u017f\u012c': array([b'\\...\n
You can see that calling .example()
multiple times will generate different examples, giving you an idea of the wide\n range of data that the xarray strategies can generate.
In your tests however you should not use .example()
- instead you should parameterize your tests with the\n hypothesis.given()
decorator:
In [6]: from hypothesis import given\n@@ -125,102 +128,115 @@\n Xarray\u2019s strategies can accept other strategies as arguments, allowing you to customise the contents of the generated\n examples.
\n # generate a Variable containing an array with a complex number dtype, but all other details still arbitrary\n In [8]: from hypothesis.extra.numpy import complex_number_dtypes\n \n In [9]: xrst.variables(dtype=complex_number_dtypes()).example()\n Out[9]: \n-<xarray.Variable (0: 1)> Size: 8B\n-array([0.+0.j], dtype=complex64)\n+<xarray.Variable (C\u013a\u017b\u00bd\u00b3: 1, \u00b5\u010e\u00c9\u00c8: 5, \u012e\u00ee0E: 5)> Size: 400B\n+array([[[-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j],\n+ [-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j],\n+ [-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j],\n+ [-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j],\n+ [-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j]]],\n+ shape=(1, 5, 5))\n+Attributes:\n+ \u00f0\u00c2\u017d\u017cL: None\n+ \u0150: ['\u00ef' '\u00ae-\\x92\\U0001bd3d\\U0010ba70vW\\U000fda5e']\n+ : ['\u00d80\\U0001b807']\n
\n \n This also works with custom strategies, or strategies defined in other packages.\n For example you could imagine creating a chunks
strategy to specify particular chunking patterns for a dask-backed array.
\n \n \n Fixing Arguments\u00b6
\n If you want to fix one aspect of the data structure, whilst allowing variation in the generated examples\n over all other aspects, then use hypothesis.strategies.just()
.
\n In [10]: import hypothesis.strategies as st\n \n # Generates only variable objects with dimensions ["x", "y"]\n In [11]: xrst.variables(dims=st.just(["x", "y"])).example()\n Out[11]: \n-<xarray.Variable (x: 6, y: 6)> Size: 72B\n-array([[ -0., -0., -0., -0., -0., -0.],\n- [ -0., -0., -0., -0., -0., -0.],\n- [ -0., -0., -0., -0., -0., -0.],\n- [ -0., -0., -0., -0., -inf, -0.],\n- [ -0., -0., -0., -0., -0., -0.],\n- [ 0., -0., -0., -0., -0., 0.]], shape=(6, 6), dtype=float16)\n+<xarray.Variable (x: 5, y: 6)> Size: 60B\n+array([[ -6591, -6591, -6591, -6591, -6591, -6591],\n+ [ -6591, -6591, -6591, -6591, -6591, -6591],\n+ [ -6591, -6591, -6591, -6591, -6591, -6591],\n+ [-31791, -6591, 8074, 24027, -6591, -6591],\n+ [ -6591, -6591, -6591, -6591, -6591, -6591]], shape=(5, 6), dtype=int16)\n+Attributes:\n+ \u00c0: {}\n+ : {}\n+ wJ: {'\u0163E': False}\n
\n \n (This is technically another example of chaining strategies - hypothesis.strategies.just()
is simply a\n special strategy that just contains a single example.)
\n To fix the length of dimensions you can instead pass dims
as a mapping of dimension names to lengths\n (i.e. following xarray objects\u2019 .sizes()
property), e.g.
\n # Generates only variables with dimensions ["x", "y"], of lengths 2 & 3 respectively\n In [12]: xrst.variables(dims=st.just({"x": 2, "y": 3})).example()\n Out[12]: \n-<xarray.Variable (x: 2, y: 3)> Size: 48B\n-array([[nan, nan, nan],\n- [nan, nan, nan]])\n+<xarray.Variable (x: 2, y: 3)> Size: 96B\n+array([[6.671e+16+8.481e-125j, 6.671e+16+8.481e-125j, 6.671e+16+8.481e-125j],\n+ [6.671e+16+8.481e-125j, 6.671e+16+8.481e-125j, 6.671e+16+8.481e-125j]])\n Attributes:\n- : {'N\u00de\u00fa\u00d6\u017c': False, '': '\u017f\u00bc', '\u017b\u017d\u00c0\u00cc': None, '\u00f0\u0161\u017e\u0152': array([b'\\xa4\\...\n+ \u00e4\u0144\u017eI\u00fa: {'\u0127\u0159\u0155': array([5438830300534694632, 29437], dtype...\n+ \u00f8\u013cFw\u017b: {'\u017d\u00c1\u00d4z\u00ee': array([[ 9963],\\n [26699]], dtype='>i2'), '': N...\n
\n \n You can also use this to specify that you want examples which are missing some part of the data structure, for instance
\n # Generates a Variable with no attributes\n In [13]: xrst.variables(attrs=st.just({})).example()\n Out[13]: \n-<xarray.Variable (0: 1)> Size: 1B\n-array([0], dtype=uint8)\n+<xarray.Variable (\u017b\u0160\u017f: 2)> Size: 16B\n+array([ 2.22e-16+2.j , -6.66e+16+0.5j], dtype=complex64)\n
\n \n Through a combination of chaining strategies and fixing arguments, you can specify quite complicated requirements on the\n objects your chained strategy will generate.
\n In [14]: fixed_x_variable_y_maybe_z = st.fixed_dictionaries(\n ....: {"x": st.just(2), "y": st.integers(3, 4)}, optional={"z": st.just(2)}\n ....: )\n ....: \n \n In [15]: fixed_x_variable_y_maybe_z.example()\n-Out[15]: {'x': 2, 'y': 3, 'z': 2}\n+Out[15]: {'x': 2, 'y': 3}\n \n In [16]: special_variables = xrst.variables(dims=fixed_x_variable_y_maybe_z)\n \n In [17]: special_variables.example()\n Out[17]: \n-<xarray.Variable (x: 2, y: 3)> Size: 96B\n-array([[-9.007e+015 -infj, 2.225e-313+5.000e-01j, -1.900e+000+2.000e+00j],\n- [ inf-1.175e-38j, 2.225e-311 +nanj, -1.000e+007-1.401e-45j]])\n+<xarray.Variable (x: 2, y: 4)> Size: 16B\n+array([[-32538, -28724, -22728, -22319],\n+ [ 31291, 23956, -32538, -16864]], dtype=int16)\n Attributes:\n- \u017c\u0160\u013a\u017c: \u017e\n- j\u017d\u015f: \u010e\n- : [1.175e-38 1.000e+00]\n- \u017c\u017c\u017d\u00bc\u0102: True\n- \u017c: \n+ \u00e4PD\u015b: [[ nan -1.00e+00]\\n [ inf 5.96e-08]]\n+ : None\n+ \u015c\u017fU\u0151\u017d: [[-10000000.]]\n+ \u0143: ['-25252734927764514-07-14' '-17372090436313634-11-02']\n+ \u017b: None\n+ \u0111\u00b5\u0134: False\n+ \u013aH: None\n+ \u010a\u00b5: \u00f5\u017ff\u00c1\u0130\n+ D: [['\u00fc\\U000ed619a\ud81e\udeddVK\\U000a0ddf#[' '\\U00065b43\u00fc\u00b9']]\n \n In [18]: special_variables.example()\n Out[18]: \n-<xarray.Variable (x: 2, y: 4)> Size: 64B\n-array([[ 1.000e+000, nan, -2.220e-016, 1.192e-007],\n- [-1.000e-005, -2.810e-262, -1.798e+308, -1.798e+308]])\n+<xarray.Variable (x: 2, y: 3, z: 2)> Size: 96B\n+array([[[ 1, 48720],\n+ [ 1286412330485489918, 28459],\n+ [ 7557085660993304329, 8665727290046625246]],\n+\n+ [[14919071171053671319, 444790435],\n+ [ 30680, 89],\n+ [ 14063, 50]]], shape=(2, 3, 2), dtype=uint64)\n Attributes:\n- : [ 1 255]\n- NIL: [15103 24540]\n- Vb\u0118\u017c\u017f: True\n- \u015f: False\n- \u017fQ\u00dc\u00bc: True\n- \u0145\u017e\u0158\u0171: 6\u017c\u017d\n- Jh\u017e\u0167\u015f: None\n- \u017bN\u00d4\u015e\u0146: False\n- \u0154\u011d\u011d\u0136\u014e: 
True\n- \u014a\u010b\u017c\u017d\u00db: None\n- V\u00e0u: None\n+ \u013b\u0119\u016b\u00c4\u0151: [b'\\x02\\xad']\n+ \u015a: \u012a\u017d\u00c3\n
\n \n Here we have used one of hypothesis\u2019 built-in strategies hypothesis.strategies.fixed_dictionaries()
to create a\n strategy which generates mappings of dimension names to lengths (i.e. the size
of the xarray object we want).\n This particular strategy will always generate an x
dimension of length 2, and a y
dimension of\n length either 3 or 4, and will sometimes also generate a z
dimension of length 2.\n By feeding this strategy for dictionaries into the dims
argument of xarray\u2019s variables()
strategy,\n@@ -321,43 +337,45 @@\n ....: array_strategy_fn=xps.arrays,\n ....: dtype=xps.scalar_dtypes(),\n ....: )\n ....: \n \n In [32]: xp_variables.example()\n Out[32]: \n-<xarray.Variable (0: 1)> Size: 4B\n-array([0.], dtype=float32)\n+<xarray.Variable (\u00e4: 2)> Size: 32B\n+array([nan-6.104e-05j, 0.5 -infj])\n+Attributes:\n+ : {'\u00f4': '\u012d\u0104\u00c3\u0115'}\n
Another array API-compliant duck array library would replace the import, e.g. import cupy as cp
instead.
A common task when testing xarray user code is checking that your function works for all valid input dimensions.\n We can chain strategies to achieve this, for which the helper strategy unique_subset_of()
\n is useful.
It works for lists of dimension names
\nIn [33]: dims = ["x", "y", "z"]\n \n In [34]: xrst.unique_subset_of(dims).example()\n-Out[34]: []\n+Out[34]: ['y']\n \n In [35]: xrst.unique_subset_of(dims).example()\n Out[35]: ['y', 'x', 'z']\n
as well as for mappings of dimension names to sizes
\nIn [36]: dim_sizes = {"x": 2, "y": 3, "z": 4}\n \n In [37]: xrst.unique_subset_of(dim_sizes).example()\n-Out[37]: {'x': 2, 'z': 4}\n+Out[37]: {'x': 2, 'z': 4, 'y': 3}\n \n In [38]: xrst.unique_subset_of(dim_sizes).example()\n-Out[38]: {'y': 3}\n+Out[38]: {'x': 2}\n
This is useful because operations like reductions can be performed over any subset of the xarray object\u2019s dimensions.\n For example we can write a pytest test that tests that a reduction gives the expected result when applying that reduction\n along any possible valid subset of the Variable\u2019s dimensions.
\nimport numpy.testing as npt\n \n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -28,32 +28,35 @@\n To see an example of what each of these strategies might produce, you can call\n one followed by the .example() method, which is a general hypothesis method\n valid for all strategies.\n In [2]: import xarray.testing.strategies as xrst\n \n In [3]: xrst.variables().example()\n Out[3]:\n- Size: 16B\n-array([inf-2.225e-313j])\n-Attributes:\n- : False\n- \u0114: [[b'\\xc6\\xcb#']\\n [b'\\x1e\\xb1\\xaf']]\n- NaN: ['\\x08\\x86\u00f8s\u00af\\x94\u00c3\\x07\\x81'\n-'\\U0008e3dc\u00e7\u00fe\u00eb\\U000caea3\\x17\\U000e0...\n+ Size: 120B\n+array([[[-5.000e-01, -3.541e+15, -5.000e-01, -5.000e-01, -5.000e-01],\n+ [-5.000e-01, -5.000e-01, -5.000e-01, -5.000e-01, -5.000e-01],\n+ [-5.000e-01, -6.104e-05, -5.000e-01, -5.000e-01, -5.000e-01]]], shape=\n+(1, 3, 5))\n \n In [4]: xrst.variables().example()\n Out[4]:\n Size: 1B\n-array([0], dtype=int8)\n+array([0], dtype=uint8)\n \n In [5]: xrst.variables().example()\n Out[5]:\n- Size: 24B\n-array([ 39788, 12523, 1851887664, 36979, 3514887451, 74],\n-dtype=uint32)\n+ Size: 64B\n+array([[ 5094675196065995282, 32272],\n+ [ 5094675196065995282, 5094675196065995282],\n+ [ 5094675196065995282, 5094675196065995282],\n+ [ 5094675196065995282, 12636881898743088449]], dtype=uint64)\n+Attributes:\n+ : {'': '\u00c4', '\u017d\u0160\u017f\u017d\u00ee': '\u0114\u0130tg\u00f0', 'false': False, '\u0109\u0119\u017e\u017f\u012c': array(\n+[b'\\...\n You can see that calling .example() multiple times will generate different\n examples, giving you an idea of the wide range of data that the xarray\n strategies can generate.\n In your tests however you should not use .example() - instead you should\n parameterize your tests with the hypothesis.given() decorator:\n In [6]: from hypothesis import given\n In [7]: @given(xrst.variables())\n@@ -65,103 +68,122 @@\n customise the contents of the generated examples.\n # 
generate a Variable containing an array with a complex number dtype, but all\n other details still arbitrary\n In [8]: from hypothesis.extra.numpy import complex_number_dtypes\n \n In [9]: xrst.variables(dtype=complex_number_dtypes()).example()\n Out[9]:\n- Size: 8B\n-array([0.+0.j], dtype=complex64)\n+ Size: 400B\n+array([[[-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -\n+inf-10000000.j],\n+ [-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -\n+inf-10000000.j],\n+ [-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -\n+inf-10000000.j],\n+ [-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -\n+inf-10000000.j],\n+ [-inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -inf-10000000.j, -\n+inf-10000000.j]]],\n+ shape=(1, 5, 5))\n+Attributes:\n+ \u00f0\u00c2\u017d\u017cL: None\n+ \u0150: ['\u00ef' '\u00ae-\\x92\\U0001bd3d\\U0010ba70vW\\U000fda5e']\n+ : ['\u00d80\\U0001b807']\n This also works with custom strategies, or strategies defined in other\n packages. 
For example you could imagine creating a chunks strategy to specify\n particular chunking patterns for a dask-backed array.\n *\b**\b**\b**\b* F\bFi\bix\bxi\bin\bng\bg A\bAr\brg\bgu\bum\bme\ben\bnt\bts\bs_\b?\b\u00b6 *\b**\b**\b**\b*\n If you want to fix one aspect of the data structure, whilst allowing variation\n in the generated examples over all other aspects, then use\n hypothesis.strategies.just().\n In [10]: import hypothesis.strategies as st\n \n # Generates only variable objects with dimensions [\"x\", \"y\"]\n In [11]: xrst.variables(dims=st.just([\"x\", \"y\"])).example()\n Out[11]:\n- Size: 72B\n-array([[ -0., -0., -0., -0., -0., -0.],\n- [ -0., -0., -0., -0., -0., -0.],\n- [ -0., -0., -0., -0., -0., -0.],\n- [ -0., -0., -0., -0., -inf, -0.],\n- [ -0., -0., -0., -0., -0., -0.],\n- [ 0., -0., -0., -0., -0., 0.]], shape=(6, 6), dtype=float16)\n+ Size: 60B\n+array([[ -6591, -6591, -6591, -6591, -6591, -6591],\n+ [ -6591, -6591, -6591, -6591, -6591, -6591],\n+ [ -6591, -6591, -6591, -6591, -6591, -6591],\n+ [-31791, -6591, 8074, 24027, -6591, -6591],\n+ [ -6591, -6591, -6591, -6591, -6591, -6591]], shape=(5, 6),\n+dtype=int16)\n+Attributes:\n+ \u00c0: {}\n+ : {}\n+ wJ: {'\u0163E': False}\n (This is technically another example of chaining strategies -\n hypothesis.strategies.just() is simply a special strategy that just contains a\n single example.)\n To fix the length of dimensions you can instead pass dims as a mapping of\n dimension names to lengths (i.e. 
following xarray objects\u2019 .sizes() property),\n e.g.\n # Generates only variables with dimensions [\"x\", \"y\"], of lengths 2 & 3\n respectively\n In [12]: xrst.variables(dims=st.just({\"x\": 2, \"y\": 3})).example()\n Out[12]:\n- Size: 48B\n-array([[nan, nan, nan],\n- [nan, nan, nan]])\n+ Size: 96B\n+array([[6.671e+16+8.481e-125j, 6.671e+16+8.481e-125j, 6.671e+16+8.481e-125j],\n+ [6.671e+16+8.481e-125j, 6.671e+16+8.481e-125j, 6.671e+16+8.481e-125j]])\n Attributes:\n- : {'N\u00de\u00fa\u00d6\u017c': False, '': '\u017f\u00bc', '\u017b\u017d\u00c0\u00cc': None, '\u00f0\u0161\u017e\u0152': array(\n-[b'\\xa4\\...\n+ \u00e4\u0144\u017eI\u00fa: {'\u0127\u0159\u0155': array([5438830300534694632, 29437],\n+dtype...\n+ \u00f8\u013cFw\u017b: {'\u017d\u00c1\u00d4z\u00ee': array([[ 9963],\\n [26699]], dtype='>i2'), '':\n+N...\n You can also use this to specify that you want examples which are missing some\n part of the data structure, for instance\n # Generates a Variable with no attributes\n In [13]: xrst.variables(attrs=st.just({})).example()\n Out[13]:\n- Size: 1B\n-array([0], dtype=uint8)\n+ Size: 16B\n+array([ 2.22e-16+2.j , -6.66e+16+0.5j], dtype=complex64)\n Through a combination of chaining strategies and fixing arguments, you can\n specify quite complicated requirements on the objects your chained strategy\n will generate.\n In [14]: fixed_x_variable_y_maybe_z = st.fixed_dictionaries(\n ....: {\"x\": st.just(2), \"y\": st.integers(3, 4)}, optional={\"z\": st.just\n (2)}\n ....: )\n ....:\n \n In [15]: fixed_x_variable_y_maybe_z.example()\n-Out[15]: {'x': 2, 'y': 3, 'z': 2}\n+Out[15]: {'x': 2, 'y': 3}\n \n In [16]: special_variables = xrst.variables(dims=fixed_x_variable_y_maybe_z)\n \n In [17]: special_variables.example()\n Out[17]:\n- Size: 96B\n-array([[-9.007e+015 -infj, 2.225e-313+5.000e-01j, -\n-1.900e+000+2.000e+00j],\n- [ inf-1.175e-38j, 2.225e-311 +nanj, -1.000e+007-1.401e-\n-45j]])\n+ Size: 16B\n+array([[-32538, -28724, -22728, -22319],\n+ [ 31291, 
23956, -32538, -16864]], dtype=int16)\n Attributes:\n- \u017c\u0160\u013a\u017c: \u017e\n- j\u017d\u015f: \u010e\n- : [1.175e-38 1.000e+00]\n- \u017c\u017c\u017d\u00bc\u0102: True\n- \u017c:\n+ \u00e4PD\u015b: [[ nan -1.00e+00]\\n [ inf 5.96e-08]]\n+ : None\n+ \u015c\u017fU\u0151\u017d: [[-10000000.]]\n+ \u0143: ['-25252734927764514-07-14' '-17372090436313634-11-02']\n+ \u017b: None\n+ \u0111\u00b5\u0134: False\n+ \u013aH: None\n+ \u010a\u00b5: \u00f5\u017ff\u00c1\u0130\n+ D: [['\u00fc\\U000ed619a\ud81e\udeddVK\\U000a0ddf#[' '\\U00065b43\u00fc\u00b9']]\n \n In [18]: special_variables.example()\n Out[18]:\n- Size: 64B\n-array([[ 1.000e+000, nan, -2.220e-016, 1.192e-007],\n- [-1.000e-005, -2.810e-262, -1.798e+308, -1.798e+308]])\n+ Size: 96B\n+array([[[ 1, 48720],\n+ [ 1286412330485489918, 28459],\n+ [ 7557085660993304329, 8665727290046625246]],\n+\n+ [[14919071171053671319, 444790435],\n+ [ 30680, 89],\n+ [ 14063, 50]]], shape=(2, 3, 2),\n+dtype=uint64)\n Attributes:\n- : [ 1 255]\n- NIL: [15103 24540]\n- Vb\u0118\u017c\u017f: True\n- \u015f: False\n- \u017fQ\u00dc\u00bc: True\n- \u0145\u017e\u0158\u0171: 6\u017c\u017d\n- Jh\u017e\u0167\u015f: None\n- \u017bN\u00d4\u015e\u0146: False\n- \u0154\u011d\u011d\u0136\u014e: True\n- \u014a\u010b\u017c\u017d\u00db: None\n- V\u00e0u: None\n+ \u013b\u0119\u016b\u00c4\u0151: [b'\\x02\\xad']\n+ \u015a: \u012a\u017d\u00c3\n Here we have used one of hypothesis\u2019 built-in strategies\n hypothesis.strategies.fixed_dictionaries() to create a strategy which generates\n mappings of dimension names to lengths (i.e. the size of the xarray object we\n want). This particular strategy will always generate an x dimension of length\n 2, and a y dimension of length either 3 or 4, and will sometimes also generate\n a z dimension of length 2. 
By feeding this strategy for dictionaries into the\n dims argument of xarray\u2019s variables() strategy, we can generate arbitrary\n@@ -255,38 +277,40 @@\n ....: array_strategy_fn=xps.arrays,\n ....: dtype=xps.scalar_dtypes(),\n ....: )\n ....:\n \n In [32]: xp_variables.example()\n Out[32]:\n- Size: 4B\n-array([0.], dtype=float32)\n+ Size: 32B\n+array([nan-6.104e-05j, 0.5 -infj])\n+Attributes:\n+ : {'\u00f4': '\u012d\u0104\u00c3\u0115'}\n Another array API-compliant duck array library would replace the import, e.g.\n import cupy as cp instead.\n *\b**\b**\b**\b* T\bTe\bes\bst\bti\bin\bng\bg o\bov\bve\ber\br S\bSu\bub\bbs\bse\bet\bts\bs o\bof\bf D\bDi\bim\bme\ben\bns\bsi\bio\bon\bns\bs_\b?\b\u00b6 *\b**\b**\b**\b*\n A common task when testing xarray user code is checking that your function\n works for all valid input dimensions. We can chain strategies to achieve this,\n for which the helper strategy unique_subset_of() is useful.\n It works for lists of dimension names\n In [33]: dims = [\"x\", \"y\", \"z\"]\n \n In [34]: xrst.unique_subset_of(dims).example()\n-Out[34]: []\n+Out[34]: ['y']\n \n In [35]: xrst.unique_subset_of(dims).example()\n Out[35]: ['y', 'x', 'z']\n as well as for mappings of dimension names to sizes\n In [36]: dim_sizes = {\"x\": 2, \"y\": 3, \"z\": 4}\n \n In [37]: xrst.unique_subset_of(dim_sizes).example()\n-Out[37]: {'x': 2, 'z': 4}\n+Out[37]: {'x': 2, 'z': 4, 'y': 3}\n \n In [38]: xrst.unique_subset_of(dim_sizes).example()\n-Out[38]: {'y': 3}\n+Out[38]: {'x': 2}\n This is useful because operations like reductions can be performed over any\n subset of the xarray object\u2019s dimensions. 
For example we can write a pytest\n test that tests that a reduction gives the expected result when applying that\n reduction along any possible valid subset of the Variable\u2019s dimensions.\n import numpy.testing as npt\n \n \n"}]}, {"source1": "./usr/share/doc/python-xarray-doc/html/whats-new.html", "source2": "./usr/share/doc/python-xarray-doc/html/whats-new.html", "unified_diff": "@@ -8191,15 +8191,15 @@\n New xray.Dataset.where
method for masking xray objects according\n to some criteria. This works particularly well with multi-dimensional data:
\n In [45]: ds = xray.Dataset(coords={"x": range(100), "y": range(100)})\n \n In [46]: ds["distance"] = np.sqrt(ds.x**2 + ds.y**2)\n \n In [47]: ds.distance.where(ds.distance < 100).plot()\n-Out[47]: <matplotlib.collections.QuadMesh at 0xffff68711950>\n+Out[47]: <matplotlib.collections.QuadMesh at 0xffff9193e350>\n
\n \n
\n \n \n Added new methods xray.DataArray.diff
and xray.Dataset.diff
\n for finite difference calculations along a given axis.
\n", "details": [{"source1": "html2text {}", "source2": "html2text {}", "unified_diff": "@@ -5286,15 +5286,15 @@\n * New xray.Dataset.where method for masking xray objects according to some\n criteria. This works particularly well with multi-dimensional data:\n In [45]: ds = xray.Dataset(coords={\"x\": range(100), \"y\": range(100)})\n \n In [46]: ds[\"distance\"] = np.sqrt(ds.x**2 + ds.y**2)\n \n In [47]: ds.distance.where(ds.distance < 100).plot()\n- Out[47]: \n+ Out[47]: \n _\b[_\b__\bi_\bm_\ba_\bg_\be_\bs_\b/_\bw_\bh_\be_\br_\be_\b__\be_\bx_\ba_\bm_\bp_\bl_\be_\b._\bp_\bn_\bg_\b]\n * Added new methods xray.DataArray.diff and xray.Dataset.diff for finite\n difference calculations along a given axis.\n * New xray.DataArray.to_masked_array convenience method for returning a\n numpy.ma.MaskedArray.\n In [48]: da = xray.DataArray(np.random.random_sample(size=(5, 4)))\n \n"}]}]}]}]}]}