  
^ Astronomer ^ Precision ^ Accuracy ^ Decision ^
| A (cyan) | Precise | Accurate | Evacuate |
| B (magenta) | Precise | Inaccurate | Stay |
| C (yellow) | Imprecise | Accurate | Uncertain |
| D (black) | Imprecise | Inaccurate | Uncertain |
  
Systematic errors are more important and dangerous for astronomy than stochastic errors. Astronomers A and B have the same precision, but after all the measurements B realizes that her values differ from the values of everyone else and, hence, are unlikely to be true. Another reason to suspect her inaccuracy is that her deviation from the others is much greater than her precision. Astronomer C sees a systematic error hidden within his stochastic errors because his values decrease predictably with time.
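
To make the distinction concrete, here is a minimal Python sketch (not part of the original discussion; the true value, biases and scatters are invented for illustration) that simulates repeated measurements by four astronomers with different systematic and stochastic errors:

<code python>
import numpy as np

rng = np.random.default_rng(42)
true_value = 100.0  # hypothetical true value of the measured quantity

# Invented bias (systematic error) and scatter (stochastic error) per astronomer
astronomers = {
    "A": {"bias": 0.0,  "scatter": 1.0},   # precise and accurate
    "B": {"bias": 10.0, "scatter": 1.0},   # precise but inaccurate
    "C": {"bias": 0.0,  "scatter": 10.0},  # imprecise but accurate
    "D": {"bias": 10.0, "scatter": 10.0},  # imprecise and inaccurate
}

for name, p in astronomers.items():
    m = true_value + p["bias"] + rng.normal(0.0, p["scatter"], size=20)
    print(f"{name}: mean = {m.mean():6.2f}, std = {m.std(ddof=1):5.2f}, "
          f"offset from truth = {m.mean() - true_value:+6.2f}")
</code>

Astronomer B's offset from the truth comes out much larger than her standard deviation, which is exactly the warning sign described above.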
[[https://colab.research.google.com/drive/1Pc1gv27ywDHavjQiNsZESRY0nij-KHAh?usp=sharing|{{:courses:ast201:stars-iron.png?nolink|}}]]
  
In this example, the velocities of 50 stars perpendicular to the [[uv:mw|Galactic plane]] are plotted as a bar-chart histogram. The stars are divided into two sets: 25 stars are iron-rich and the other 25 are iron-poor. The iron-poor stars have a greater dispersion of velocities, as is evident from the plot. The mean and the standard deviation are shown as vertical lines and shaded bands, respectively.
  
Standard deviation is sometimes better than variance for describing physical phenomena because $\sigma$ has the same unit as $\mu$, whereas the unit is squared in the $\sigma^2$ statistic. In this example, the variance among iron-rich stars is $57.25$ km$^2$ s$^{-2}$ whereas the standard deviation is $7.57$ km s$^{-1}$.
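
The summary statistics can be computed along the following lines; this is only a sketch, and the velocities below are randomly generated stand-ins rather than the actual data behind the figure.

<code python>
import numpy as np

rng = np.random.default_rng(0)

# Stand-in vertical velocities (km/s) for two hypothetical sets of 25 stars
v_rich = rng.normal(0.0, 7.5, 25)    # iron-rich: smaller dispersion
v_poor = rng.normal(0.0, 18.0, 25)   # iron-poor: larger dispersion

for label, v in [("iron-rich", v_rich), ("iron-poor", v_poor)]:
    mu = v.mean()            # km/s
    var = v.var(ddof=1)      # km^2/s^2 : the unit is squared
    sigma = v.std(ddof=1)    # km/s     : same unit as the mean
    print(f"{label}: mean = {mu:7.2f} km/s, variance = {var:7.2f} km^2/s^2, "
          f"std = {sigma:6.2f} km/s")
</code>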
  
===== - Probability distributions =====
 +"The most important questions of life are, for the most part, really only problems of probability." --- Pierre-Simon Laplace
 +
 Because astronomical observations always select a sample from a given population (e. g. mass of 100 stars from 100 billion), it is important that we understand how to construct a sample from a population. Because astronomical observations always select a sample from a given population (e. g. mass of 100 stars from 100 billion), it is important that we understand how to construct a sample from a population.
  
Imagine a jar full of metal balls with differing diameters. Stir up the jar and reach into it to take a ball without looking. This operation is called a 'trial' and the result of the trial is a diameter $x$, a **random variable**. No matter how randomly you choose the balls, there is a function that describes the probability of getting a diameter $x$ in a single trial. If the sample $x$ is taken from a population $Q$, this function, $P_Q(x)$, is called the **probability distribution** function because it gives the probability of finding the value $x$ in a single trial from the population $Q$.

Probability distributions can be **continuous** or **discrete**. Continuous distributions allow all integer and fractional values, while discrete distributions allow only a discrete set of values. In the continuous case, $P_Q(x)\,dx$ is the probability of the value lying between $x$ and $x+dx$, and in the discrete case, $P_Q(x_j)$ is the probability of the value being $x_j$, where $j=1,2,3,\ldots$
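
As a quick numerical illustration of the two cases (a sketch using a Gaussian density and a Poisson distribution as examples, with arbitrary parameters), the continuous probabilities $P(x)\,dx$ integrate to one while the discrete probabilities $P(x_j)$ sum to one:

<code python>
import numpy as np
from math import exp, factorial, pi, sqrt

# Continuous case: Gaussian density with mu = 0, sigma = 1
mu, sigma = 0.0, 1.0
x = np.linspace(-10, 10, 100001)
dx = x[1] - x[0]
pdf = np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))
print("integral of P(x) dx :", np.sum(pdf * dx))   # close to 1

# Discrete case: Poisson distribution with mean 4
lam = 4.0
pmf = [lam**j * exp(-lam) / factorial(j) for j in range(50)]
print("sum of P(x_j)       :", sum(pmf))           # close to 1
</code>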

The probability distributions most used in astronomy are the discrete [[uv:poisson|Poisson distribution]] and the continuous [[uv:gaussian|Gaussian distribution]]. Read the linked //Universe// articles for more about them.

Here let us instead compare the two. The Poisson distribution allows only non-negative integer values and describes the number of events within a duration of time, for example the number of raindrops falling on a tin roof in one second, or the number of photons falling on the detector of the Chandra X-ray telescope. The Gaussian distribution, on the other hand, allows any positive or negative value and can describe repeated measurements of any given quantity. For example, if you measure the magnitude of a star 100 times and get rid of all systematic errors, the stochastic-error-dominated final results can be described using a Gaussian.

The Gaussian is symmetric with respect to the mean $\mu$, and its standard deviation $\sigma$ is completely independent of the mean. The Poisson distribution is not symmetric, and its variance is exactly equal to its mean: $\sigma^2=\mu$.

In the case of the Poisson distribution, the fractional uncertainty $\sigma/\mu$ in measuring $N$ events is proportional to $N^{-1/2}$.
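
Both statements can be checked with simulated Poisson counts; the following sketch uses arbitrary mean counts.

<code python>
import numpy as np

rng = np.random.default_rng(1)

# For Poisson counts the variance equals the mean, so sigma/mu = 1/sqrt(N)
for n_events in [4, 100, 10000]:
    counts = rng.poisson(n_events, size=100_000)
    mu, var = counts.mean(), counts.var()
    print(f"mu = {mu:10.2f}  sigma^2 = {var:10.2f}  "
          f"sigma/mu = {np.sqrt(var)/mu:.4f}  1/sqrt(N) = {1/np.sqrt(n_events):.4f}")
</code>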

The [[uv:mbd|Maxwell-Boltzmann distribution]] is another widely used function in physics and astronomy.

If a distribution $P(x,\mu,\sigma)$ is known, its mean and variance can be calculated easily. For the continuous case

$$ \mu = \frac{1}{N} \int_{-\infty}^{\infty} x P\,dx \ ; \quad \sigma^2 = \frac{1}{N} \int_{-\infty}^{\infty} (x-\mu)^2 P\,dx $$

and for the discrete case

$$ \mu = \frac{1}{N} \sum_{j=-\infty}^\infty x_j P(x_j) \ ; \quad \sigma^2 = \frac{1}{N} \sum_{j=-\infty}^\infty (x_j-\mu)^2 P(x_j), $$

where $N$ is the normalization, $N=\int P\,dx$ in the continuous case and $N=\sum_j P(x_j)$ in the discrete case, so that $N=1$ for a properly normalized probability distribution.
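
The continuous formulas can be verified numerically; the sketch below uses an assumed Gaussian shape for $P$, deliberately left unnormalized so that the role of the normalization $N$ is visible.

<code python>
import numpy as np

# Assumed Gaussian-shaped P(x, mu, sigma), deliberately left unnormalized
mu_true, sigma_true = 5.0, 2.0
x = np.linspace(mu_true - 10*sigma_true, mu_true + 10*sigma_true, 200001)
dx = x[1] - x[0]
P = np.exp(-(x - mu_true)**2 / (2 * sigma_true**2))

N = np.sum(P * dx)                        # normalization
mu = np.sum(x * P * dx) / N               # mean
var = np.sum((x - mu)**2 * P * dx) / N    # variance
print(f"mu = {mu:.3f} (true {mu_true}), sigma^2 = {var:.3f} (true {sigma_true**2})")
</code>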
===== - Estimation =====
Estimation is the effort of determining the characteristics of a population based on samples. The **central limit theorem** is the most important concept here, so focus on it first.
  
Let us go back to the example of a jar full of metal spheres. The spheres are made of iron and there are 10,000 of them in an enormous jar. A Devi plots the true distribution of the diameters of the iron spheres in the left panel of the following figure.

[[https://colab.research.google.com/drive/1MdVr_DAp1ipE8jBvhH63c83IwSj6kqJF?usp=sharing|{{:courses:ast201:clt.png?nolink&850|}}]]

As is evident from the left panel, the distribution has two peaks, at around 3 mm and 8 mm, and it is not Gaussian as a whole even though the individual peaks look Gaussian. Now assign a demigod to reach into the jar, pick up 10 spheres blindly, measure their diameters and calculate the mean. He then puts them back, again picks up 10 spheres blindly and repeats the procedure. In this way he samples the population 1000 times and creates the right panel of the figure from the means of the 10-sphere samples in each of the 1000 trials. Miraculously, the distribution of the sample means looks totally Gaussian. This miracle is called the central limit theorem.
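
The demigod's experiment can be mimicked with a short simulation; the bimodal population below is generated artificially and only stands in for the data behind the figure.

<code python>
import numpy as np

rng = np.random.default_rng(3)

# Artificial population: 10,000 sphere diameters (mm) with peaks near 3 and 8
population = np.concatenate([rng.normal(3.0, 0.5, 5000),
                             rng.normal(8.0, 0.8, 5000)])

# 1000 trials: blindly pick 10 spheres, record the mean, put them back
sample_means = np.array([rng.choice(population, size=10, replace=False).mean()
                         for _ in range(1000)])

print("population mean      :", round(float(population.mean()), 3))
print("mean of sample means :", round(float(sample_means.mean()), 3))
# A histogram of sample_means is approximately Gaussian even though
# the population itself is bimodal: the central limit theorem at work.
</code>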
===== - Propagation =====
If an equation has multiple variables, the error associated with each variable contributes to the final error, so the errors have to be propagated from the right-hand side to the left-hand side of the equation. For example, differential photometry requires us to subtract one magnitude from another:

$$ \Delta m = m_1 - m_2 $$

where $m_1$ is the measured magnitude of a standard calibrator object and $m_2$ the magnitude of our designated unknown object. Now if $m_1$ has an error of $\sigma_1$ and $m_2$ an error of $\sigma_2$, then the final variance will be

$$ \sigma^2 = \sigma_1^2 + \sigma_2^2 $$

that is, the variances do not get subtracted but add up. For both addition and subtraction, the errors always add up. For multiplication and division the rule is different. For example, the ratio of the two corresponding fluxes

$$ F = \frac{F_1}{F_2} $$

will have an uncertainty related to the variance $\sigma^2$ as follows:

$$ \left(\frac{\sigma}{F}\right)^2 = \left(\frac{\sigma_1}{F_1}\right)^2 + \left(\frac{\sigma_2}{F_2}\right)^2 $$

and again you see that whether there is multiplication or division, the fractional errors always add up. The rule of error propagation can be generalized in the following way.

If $G$ is a function of $n$ variables $x_i$ whose standard deviations are $\sigma_i$, then the variance in $G$ is given by

$$ \sigma^2 = \sum_{i=1}^n \left(\frac{\partial G}{\partial x_i}\right)^2 \sigma_i^2 + \mathcal{C} $$

where $\mathcal{C}$ is the [[uv:covariance]], which we can take to be zero for the current purpose. So we have to perform a partial differentiation of the function with respect to each variable $x_i$ and multiply the square of the result by the variance of that variable ($\sigma_i^2$). The rules of propagation for subtraction (addition) and division (multiplication) shown above can be derived from this.
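
As a sanity check, the general formula can be compared against a brute-force Monte Carlo simulation; the sketch below uses hypothetical flux values and uncertainties for the ratio $F = F_1/F_2$ discussed above.

<code python>
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical measured fluxes and their 1-sigma uncertainties
F1, s1 = 120.0, 3.0
F2, s2 = 80.0, 2.0

# Analytic propagation: (sigma/F)^2 = (s1/F1)^2 + (s2/F2)^2, covariance ignored
F = F1 / F2
sigma_analytic = F * np.sqrt((s1 / F1)**2 + (s2 / F2)**2)

# Monte Carlo check: draw many realizations and look at the spread of the ratio
ratios = rng.normal(F1, s1, 100_000) / rng.normal(F2, s2, 100_000)
print(f"analytic sigma    = {sigma_analytic:.4f}")
print(f"Monte Carlo sigma = {ratios.std():.4f}")
</code>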
  