{ "cells": [ { "cell_type": "markdown", "id": "8c281cd5b38713c6", "metadata": {}, "source": "# Time series data - the basics" }, { "metadata": {}, "cell_type": "markdown", "source": "This tutorial covers basic time series data management and analysis using `plans`.", "id": "3536ad34a268ba66" }, { "cell_type": "markdown", "id": "bf7d86c6e4cdd29b", "metadata": {}, "source": [ "## Notebook setup\n", "\n", "If you are running this tutorial as a Jupyter Notebook, execute this cell first:" ] }, { "cell_type": "code", "id": "initial_id", "metadata": {}, "source": [ "import sys\n", "from pathlib import Path\n", "import pprint\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "\n", "# Install `plans` in `google.colab`.\n", "# Use `pip install plans` for other environments.\n", "\n", "if \"google.colab\" in sys.modules:\n", "    import os\n", "    os.system(f\"{sys.executable} -m pip install -q plans\")\n", "\n", "# This avoids warnings related to uninstalled fonts\n", "import logging\n", "# Set the matplotlib font manager logger to only show errors (hides warnings)\n", "logging.getLogger('matplotlib.font_manager').setLevel(logging.ERROR)\n", "\n", "# define output folder\n", "OUTPUT_DIR = Path(\"outputs/time-series\")\n", "OUTPUT_DIR.mkdir(parents=True, exist_ok=True)\n", "print(f\"Outputs will be saved to: ./{OUTPUT_DIR}\")" ], "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": [ "## The `TimeSeries` object\n", "\n", "The `TimeSeries` object is a primitive class in the `plans.datasets` module.\n", "It is a child of the `Univar` object, which lives in the `plans.analyst` module.\n", "\n", "`TimeSeries` provides the core methods for working with time series, including standardization." 
], "id": "4f68d69935fd52c2" }, { "cell_type": "markdown", "id": "1dbe61130c1edf08", "metadata": {}, "source": "Import the `TimeSeries` object:" }, { "cell_type": "code", "id": "313c51eb1ab99942", "metadata": {}, "source": "from plans.datasets import TimeSeries", "outputs": [], "execution_count": null }, { "cell_type": "markdown", "id": "146999734e415289", "metadata": {}, "source": "Create an instance of `TimeSeries`:" }, { "metadata": {}, "cell_type": "code", "source": "ts = TimeSeries(name=\"Testing\", alias=\"tst\")", "id": "46762aeec6534038", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "Check the type of the `ts` variable:", "id": "733fddc2904d407c" }, { "metadata": {}, "cell_type": "code", "source": "print(type(ts))", "id": "151454dca15b1994", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "Attributes work the same way as in the `Univar` object:", "id": "5a3b26d3a548c41" }, { "metadata": {}, "cell_type": "code", "source": [ "ts.units = \"cm\"\n", "ts.description = \"Just a tutorial\"\n", "print(ts)" ], "id": "a1b22adb89d091e3", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": [ "## Working with perfect data\n", "\n", "Perfect time series data has no time gaps, so it does not need standardization." 
], "id": "d551819e06105839" }, { "metadata": {}, "cell_type": "markdown", "source": [ "### Create synthetic time series data\n", "\n", "Let's first make a perfect time series using the `.make_synthetic_tsn()` method and save it to a CSV file.\n", "This method generates a time series following the Trend-Seasonality-Noise (TSN) archetype:" ], "id": "121250c4864af17d" }, { "metadata": {}, "cell_type": "code", "source": [ "# make synthetic TSN (Trend-Seasonality-Noise) time-series\n", "df = TimeSeries.make_synthetic_tsn(\n", "    start=\"2020-01-01\",\n", "    end=\"2026-01-01\",\n", "    base=100,\n", "    freq=\"3h\",\n", "    trend=0.002,\n", "    noise_sd=3.0,\n", "    amplitude=50,\n", "    seasonal_period=\"YS\",\n", "    minor_amplitude=20,\n", "    minor_seasonal_period=\"D\"\n", ")\n", "\n", "# Export CSV file (index=False leaves the row index out of the file)\n", "file_csv = OUTPUT_DIR / \"time_series.csv\"\n", "df.to_csv(file_csv, sep=\";\", index=False)\n", "print(f\"Saved to: {file_csv}\")" ], "id": "d9b3bfe3f0592b3d", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "The whole time series looks like this:", "id": "c6e6c32c89a5b687" }, { "metadata": {}, "cell_type": "code", "source": [ "# get simple visualization\n", "plt.plot(df['datetime'], df['level'])\n", "plt.ylim([0, 250])\n", "plt.show()" ], "id": "f51af2392847d789", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "Zooming in to the monthly scale:", "id": "af9c95cd5eb4b0f9" }, { "metadata": {}, "cell_type": "code", "source": [ "# zoom in on the first month\n", "plt.plot(df['datetime'], df['level'])\n", "plt.xlim(pd.to_datetime([\"2020-01-01 00:00:00\", \"2020-02-01 00:00:00\"]))\n", "plt.show()" ], "id": "9eaabacd51fcac04", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "### Loading data from the CSV file", "id": "6f5ef4374e83f9e0" }, { "metadata": {}, "cell_type": "markdown", "source": "Call the `.load_data()` method to load data from the CSV file:", "id": "e9a09516beed41e0" 
}, { "metadata": {}, "cell_type": "code", "source": [ "# reset the ts variable\n", "ts = TimeSeries(name=\"Testing\", alias=\"tst\")\n", "\n", "ts.load_data(\n", "    file_data=file_csv, # file path\n", "    input_dtfield=\"datetime\", # name of datetime field\n", "    input_varfield=\"level\", # name of variable\n", "    in_sep=\";\", # input separator\n", "    filter_dates=[\"2020-01-01\", \"2020-03-01\"] # filter dates\n", ")" ], "id": "380bd091b6e68a7", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "Data is stored in the `.data` attribute:", "id": "892d5f871acca572" }, { "metadata": {}, "cell_type": "code", "source": "ts.data", "id": "939501d7b274b49a", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Visualizations", "id": "a37d27fe4746e966" }, { "metadata": {}, "cell_type": "markdown", "source": [ "Most `plans` objects come with built-in visualization methods that support both inline display and figure output.\n", "\n", "```{seealso}\n", "Check out more about visualizations on the {doc}`Visualizations - the basics ` tutorial.\n", "```" ], "id": "3035b46f8c71f2a0" }, { "metadata": {}, "cell_type": "markdown", "source": "### Standard visualization", "id": "d2a32f4125bc6bc9" }, { "metadata": {}, "cell_type": "markdown", "source": "Get the standard visualization using the `.view()` method:", "id": "bd5a415075784b89" }, { "metadata": {}, "cell_type": "code", "source": "ts.view()", "id": "4b7cc7e6fe9f5a25", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "### Fine-tuning visual items", "id": "3e9f82e5c422ab92" }, { "metadata": {}, "cell_type": "markdown", "source": "Fine-tune the plot specs by editing the `.view_specs` attribute dictionary:", "id": "9d860f84ea7c23d3" }, { "metadata": {}, "cell_type": "code", "source": [ "# reset view_specs\n", "ts._set_view_specs()\n", "# edit specs\n", "# color of the main line\n", "ts.view_specs[\"color\"] = 
\"blue\"\n", "ts.view_specs[\"color_hist\"] = \"green\"\n", "\n", "# Style of line\n", "ts.view_specs[\"drawstyle\"] = \"steps-mid\"\n", "\n", "# Labels\n", "ts.view_specs[\"ylabel\"] = f\"Level ({ts.units})\"\n", "ts.view_specs[\"xlabel\"] = \"Date\"\n", "\n", "# Titles\n", "ts.view_specs[\"title\"] = \"Hello! This is a Tutorial!\"\n", "ts.view_specs[\"subtitle_data\"] = r\"$\\bf{a}$ Time Series\"\n", "ts.view_specs[\"subtitle_hist\"] = r\"$\\bf{b}$ Histogram\"\n", "ts.view_specs[\"subtitle_cdf\"] = r\"$\\bf{c}$ CDF\"\n", "\n", "# Axis\n", "ts.view_specs[\"range\"] = [0, 200.0]\n", "\n", "# Number of dates in the X axis\n", "ts.view_specs[\"n_dates\"] = 7\n", "\n", "# Call view() again\n", "ts.view()" ], "id": "13a2c6d9926d7d8d", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "### Layouts available", "id": "816470d03d4281d7" }, { "metadata": {}, "cell_type": "markdown", "source": "List available layouts:", "id": "72c4f5442e9f34cf" }, { "metadata": {}, "cell_type": "code", "source": "print(ts.layouts.keys())", "id": "cd7b81cef42c4903", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": [ "ts.view_specs[\"layout\"] = \"full\"\n", "ts.view()" ], "id": "739db07f9a0709dc", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": [ "ts.view_specs[\"layout\"] = \"mini\"\n", "ts.view()" ], "id": "1c9af03bb20b8063", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": [ "ts.view_specs[\"layout\"] = \"simple\"\n", "ts.view()" ], "id": "81c60cac261d99e3", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": [ "ts.view_specs[\"layout\"] = \"simple-shallow\"\n", "ts.view()" ], "id": "79c505c6dd600e56", "outputs": [], "execution_count": null } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 5 }