{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# AM783: Applied Markov processes | Statistical or Monte Carlo estimation\n", "\n", "Hugo Touchette\n", "\n", "13 August 2020\n", "\n", "Last updated: 11 August 2022\n", "\n", "Python 3" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Magic command for vectorised figures\n", "%config InlineBackend.figure_format = 'svg'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Point estimator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We want to estimate the mean $E[R]$ of a sample of values known to be distributed from the Rayleigh distribution\n", "$$\n", "p(r) = r e^{-r^2/2}.\n", "$$\n", "For this, we generate a sample of $L$ values and calculate the estimator\n", "$$\n", "\\hat\\mu_L=\\frac{1}{L}\\sum_{i=1}^L X_i.\n", "$$\n", "By increasing $L$, the estimator should get closer to the theoretical expectation, which is known to be\n", "$$\n", "E[R] = \\int_0^\\infty r\\, p(r)\\, dr = \\sqrt{\\frac{\\pi}{2}}.\n", "$$" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Estimator value = 1.2525155110475363\n", "Theoretical value = 1.2533141373155001\n" ] } ], "source": [ "L = 10**6\n", "sample = np.random.rayleigh(1, L) # Using the built-in Rayleigh generator\n", "mean_est = np.mean(sample)\n", "print('Estimator value = ', mean_est)\n", "print('Theoretical value = ', np.sqrt(np.pi/2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Repeating the calculation, we indeed see that estimated values get close to the theoretical expectation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian error bars" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the sample used to obtain the mean estimate, we can also calculate the Gaussian error, as shown in the lecture notes." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Estimator value = 1.252569432127885 ± 0.0006548888077747249\n", "Absolute error = 0.0007447051876150468\n" ] } ], "source": [ "L = 10**6\n", "sample = np.random.rayleigh(1, L) # Using the built-in Rayleigh generator\n", "mean_est = np.mean(sample)\n", "mean_est_err = np.std(sample)/np.sqrt(L)\n", "print('Estimator value = ', mean_est, \"±\", mean_est_err)\n", "print('Absolute error = ', np.abs(mean_est-np.sqrt(np.pi/2)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Convergence analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An estimation should be repeated for increasing values of $L$ to show that it converges. The code next does this efficiently without recalculating the estimator and its error for each $L$." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "Lmax = 10**4\n", "\n", "# Empty lists to receive results\n", "mean_est_list = np.zeros(Lmax)\n", "mean_est_err_list = np.zeros(Lmax)\n", "\n", "# Initial values of sums\n", "s = 0.0\n", "v = 0.0\n", "\n", "for i in range(Lmax):\n", " r = np.random.rayleigh(1)\n", " s += r # Accumulate the RVs in sum to get mean estimator\n", " v += r**2 # Accumulate the RVs^2 to get the error bar\n", "\n", " mean_est = s/(i+1.0) # Mean estimator\n", " sec_mom_est = v/(i+1.0) # Second moment estimator\n", " mean_est_err = np.sqrt(sec_mom_est-mean_est**2) / np.sqrt(i+1.0) # Estimator standard deviation\n", "\n", " # Put results in lists\n", " mean_est_list[i] = mean_est\n", " mean_est_err_list[i] = mean_est_err\n", "\n", "# Plot estimator with error bar\n", "plt.errorbar(range(Lmax), mean_est_list, mean_est_err_list, errorevery=10)\n", "\n", "# Show theoretical value\n", "plt.plot(range(Lmax), np.zeros(Lmax)+np.sqrt(np.pi/2), 'k--')\n", "plt.xlabel(r'$L$')\n", "plt.ylabel(r'$\\hat\\mu_L$')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, we can show the error as a filled plot:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "Lmax = 10**3\n", "mean_est_list = np.zeros(Lmax)\n", "mean_est_err_list = np.zeros(Lmax)\n", "\n", "s = 0.0\n", "v = 0.0\n", "\n", "for i in range(Lmax):\n", " r = np.random.rayleigh(1)\n", " s += r\n", " v += r**2\n", "\n", " mean_est = s/(i+1.0)\n", " sec_mom_est = v/(i+1.0)\n", " mean_est_err = np.sqrt(sec_mom_est-mean_est**2)/np.sqrt(i+1.0)\n", "\n", " mean_est_list[i] = mean_est\n", " mean_est_err_list[i] = mean_est_err\n", "\n", "plt.plot(range(Lmax), mean_est_list, 'k-')\n", "plt.fill_between(range(Lmax), \n", " mean_est_list-mean_est_err_list, \n", " mean_est_list + mean_est_err_list,\n", " alpha=0.2, \n", " edgecolor='#1B2ACC', \n", " facecolor='#089FFF')\n", "plt.plot(range(Lmax), np.zeros(Lmax)+np.sqrt(np.pi/2), 'k--')\n", "plt.ylim([1.0, 1.5])\n", "plt.xlabel(r'$L$')\n", "plt.ylabel(r'$\\hat\\mu_L$')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }