{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AM783: Applied Markov processes | Statistical or Monte Carlo estimation\n",
"\n",
"Hugo Touchette\n",
"\n",
"13 August 2020\n",
"\n",
"Last updated: 11 August 2022\n",
"\n",
"Python 3"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# Magic command for vectorised figures\n",
"%config InlineBackend.figure_format = 'svg'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Point estimator"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We want to estimate the mean $E[R]$ of a sample of values known to be distributed from the Rayleigh distribution\n",
"$$\n",
"p(r) = r e^{-r^2/2}.\n",
"$$\n",
"For this, we generate a sample of $L$ values and calculate the estimator\n",
"$$\n",
"\\hat\\mu_L=\\frac{1}{L}\\sum_{i=1}^L X_i.\n",
"$$\n",
"By increasing $L$, the estimator should get closer to the theoretical expectation, which is known to be\n",
"$$\n",
"E[R] = \\int_0^\\infty r\\, p(r)\\, dr = \\sqrt{\\frac{\\pi}{2}}.\n",
"$$"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Estimator value = 1.2525155110475363\n",
"Theoretical value = 1.2533141373155001\n"
]
}
],
"source": [
"L = 10**6\n",
"sample = np.random.rayleigh(1, L) # Using the built-in Rayleigh generator\n",
"mean_est = np.mean(sample)\n",
"print('Estimator value = ', mean_est)\n",
"print('Theoretical value = ', np.sqrt(np.pi/2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Repeating the calculation, we indeed see that estimated values get close to the theoretical expectation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Gaussian error bars"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From the sample used to obtain the mean estimate, we can also calculate the Gaussian error, as shown in the lecture notes."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Estimator value = 1.252569432127885 ± 0.0006548888077747249\n",
"Absolute error = 0.0007447051876150468\n"
]
}
],
"source": [
"L = 10**6\n",
"sample = np.random.rayleigh(1, L) # Using the built-in Rayleigh generator\n",
"mean_est = np.mean(sample)\n",
"mean_est_err = np.std(sample)/np.sqrt(L)\n",
"print('Estimator value = ', mean_est, \"±\", mean_est_err)\n",
"print('Absolute error = ', np.abs(mean_est-np.sqrt(np.pi/2)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Convergence analysis"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An estimation should be repeated for increasing values of $L$ to show that it converges. The code next does this efficiently without recalculating the estimator and its error for each $L$."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"Lmax = 10**4\n",
"\n",
"# Empty lists to receive results\n",
"mean_est_list = np.zeros(Lmax)\n",
"mean_est_err_list = np.zeros(Lmax)\n",
"\n",
"# Initial values of sums\n",
"s = 0.0\n",
"v = 0.0\n",
"\n",
"for i in range(Lmax):\n",
" r = np.random.rayleigh(1)\n",
" s += r # Accumulate the RVs in sum to get mean estimator\n",
" v += r**2 # Accumulate the RVs^2 to get the error bar\n",
"\n",
" mean_est = s/(i+1.0) # Mean estimator\n",
" sec_mom_est = v/(i+1.0) # Second moment estimator\n",
" mean_est_err = np.sqrt(sec_mom_est-mean_est**2) / np.sqrt(i+1.0) # Estimator standard deviation\n",
"\n",
" # Put results in lists\n",
" mean_est_list[i] = mean_est\n",
" mean_est_err_list[i] = mean_est_err\n",
"\n",
"# Plot estimator with error bar\n",
"plt.errorbar(range(Lmax), mean_est_list, mean_est_err_list, errorevery=10)\n",
"\n",
"# Show theoretical value\n",
"plt.plot(range(Lmax), np.zeros(Lmax)+np.sqrt(np.pi/2), 'k--')\n",
"plt.xlabel(r'$L$')\n",
"plt.ylabel(r'$\\hat\\mu_L$')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, we can show the error as a filled plot:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"Lmax = 10**3\n",
"mean_est_list = np.zeros(Lmax)\n",
"mean_est_err_list = np.zeros(Lmax)\n",
"\n",
"s = 0.0\n",
"v = 0.0\n",
"\n",
"for i in range(Lmax):\n",
" r = np.random.rayleigh(1)\n",
" s += r\n",
" v += r**2\n",
"\n",
" mean_est = s/(i+1.0)\n",
" sec_mom_est = v/(i+1.0)\n",
" mean_est_err = np.sqrt(sec_mom_est-mean_est**2)/np.sqrt(i+1.0)\n",
"\n",
" mean_est_list[i] = mean_est\n",
" mean_est_err_list[i] = mean_est_err\n",
"\n",
"plt.plot(range(Lmax), mean_est_list, 'k-')\n",
"plt.fill_between(range(Lmax), \n",
" mean_est_list-mean_est_err_list, \n",
" mean_est_list + mean_est_err_list,\n",
" alpha=0.2, \n",
" edgecolor='#1B2ACC', \n",
" facecolor='#089FFF')\n",
"plt.plot(range(Lmax), np.zeros(Lmax)+np.sqrt(np.pi/2), 'k--')\n",
"plt.ylim([1.0, 1.5])\n",
"plt.xlabel(r'$L$')\n",
"plt.ylabel(r'$\\hat\\mu_L$')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}