# π Fitting Experimental Data in Python

```
## Background
```

As part of my undergraduate studies I recently had to write a formal report on the exciting topic of mechanical resonance after having collected data from the transient and steady state responses of a oscillating torsional pendulum.

Rather problematically, we have yet to be taught any meaningful data processing outside of basic Excel regression, which proved a bit of a hindrance in establishing the quality factor of the oscillator.

Supposedly this could be done by taking the ratio of the amplitude at resonance and the amplitude for an angular frequency of 0 radians per second. The amplitude at resonance would then have to be guestimated from the graph in Figure 1. Thatβs hardly very scientific and in my case introduced such large errors that the estimated quality factor was effectively useless with an uncertainty north of 20%.

## A better way

A better way of going about this would be to establish the resonant frequency $Ο_0$ and the damping coefficient $Ξ³$ by fitting the data to the following model:

$$ X(Ο) = \frac{f}{\sqrt{\qty(Ο^2_0 - Ο^2)^2} + 4Ξ³^2 Ο^2} $$

The quality factor could then be found without resorting to questionable approximations by using the definition of the quality factor:

$$ Q = \frac{Ο_0}{2Ξ³} $$

Performing a custom regression analysis is however not a trivial task given the set of tools made available to a first year Natural Sciences student in the UK.

## Python to the rescue

Luckily Python is an excellent language for all things science and a range of packages give easy access to powerful data analysis tools.

In order to perform the custom fit I used the following Python packages:

- lmfit
- numpy
- pandas
- sympy
- matplotlib
- uncertainties

### Getting the Excel data into pandas

The first challenge is to get the data from the Excel file into a `pandas.DataFrame`

. This is code rather crudely by using the `pandas.read_excel`

function and simply selecting the data for the two different degrees of damping examined.

```
# Initialisation
import numpy as np
from lmfit import Model
import pandas as pd
import sympy as sym
from matplotlib import pyplot as plt
from uncertainties import ufloat
XLSX_LOC = 'E5data.xlsx'
XLSX_SHEET_NAME = 'E5-forced'
```

```
raw_df = pd.read_excel(XLSX_LOC, sheet_name=XLSX_SHEET_NAME)
df = pd.DataFrame(columns=['X', 'omega', 'I_braking'])
# Get 0.3A data
df1_raw = raw_df.iloc[7:22]
df1 = pd.DataFrame()
df1['X'] = df1_raw['Unnamed: 6'].astype(np.float64)
df1['omega'] = df1_raw['Unnamed: 3'].astype(np.float64)
df1['I_braking'] = 0.3
# Get 0.6A data
df2_raw = raw_df.iloc[26:34]
df2 = pd.DataFrame()
df2['X'] = df2_raw['Unnamed: 6'].astype(np.float64)
df2['omega'] = df2_raw['Unnamed: 3'].astype(np.float64)
df2['I_braking'] = 0.6
# Reindex
df = pd.concat([df, df1, df2])
df = df.reset_index(drop=True).reset_index().set_index(['I_braking'])
df.head()
```

index | X | omega | |
---|---|---|---|

I_braking | |||

0.3 | 0 | 0.45 | 5.317223 |

0.3 | 1 | 0.85 | 4.472018 |

0.3 | 2 | 2.90 | 3.690564 |

0.3 | 3 | 4.00 | 3.580163 |

0.3 | 4 | 4.50 | 3.544815 |

Which outputs the DataFrame as shown in Figure 2.

### Defining a function to fit to

Now we need to use lmfit to define a Model to which we can fit the data. This is done by defining a function corresponding to Equation 2.

If we try to fit the function to the data straight away we run into problems β we need to provide a reasonable initial guess of the different parameters. This could probably be done by simple intelligent guess-work, but we could also solve 3 equations with 3 unknowns using the `sympy`

package and by defining meaningful bounds for our parameters.

```
# Setup lmfit model
def fit_func(omega_, f_, omega0_, gamma_):
return f_ / (
np.sqrt((omega0_**2 - omega_**2) ** 2 + 4 * (gamma_**2) * (omega_**2))
)
model = Model(fit_func)
# Setup SymPy
omega, f, omega0, gamma, X = sym.symbols("omega,f,omega0,gamma,X")
eq = sym.Eq(
f / (sym.sqrt((omega0**2 - omega**2) ** 2 + 4 * (gamma**2) * (omega**2))), X
)
```

### Performing the fit

We are now ready to perform the fit. We need to do it twice β once for each degree of damping ($I_{braking} = 0.3A$ and $I_{braking} = 0.6A$). We store the results of our regression analysis in the a `dict`

called `results`

```
results = {}
for damp_deg in [0.3, 0.6]:
# Set up SymPy equations
eqs = []
for i in [0, 2, 4]:
eqs.append(
eq.subs(
[
(X, df.loc[damp_deg]["X"].iloc[i]),
(omega, df.loc[damp_deg]["omega"].iloc[i]),
]
)
)
sol = sym.nsolve(eqs, (omega0, gamma, f), (3, 1, 10), dict=True, verify=False)
params = model.make_params(
omega0_=np.float64(sol[0][omega0]),
f_=np.float64(sol[0][f]),
gamma_=np.float64(sol[0][gamma]),
)
params["omega0_"].min = 2
params["omega0_"].max = 5
params["f_"].min = 0
params["gamma_"].min = 0
result = model.fit(
df.loc[damp_deg]["X"], params, omega_=df.loc[damp_deg]["omega"], method="Nelder"
)
results[damp_deg] = result
```

### Plotting the data and fit

Finally we are ready to plot the data and accompanying fits using `matplotlib`

. This is fairly straight forward for anyone used to `matplotlib`

or `MATLAB`

. By using lmfit we get easy access to the standard errors, which we can find by doing `ModelResult.eval_uncertainty()`

. We can also calculate the $RΒ²$ value from the residuals from the fit.

`ufloat`

from the `uncertainties`

package is used to easily propagate the standard errors from the $Οβ$ and $Ξ³$ estimates onto the quality factor estimate (see Equation 2).

### The result

`matplotlib`

gives us this lovely figure as a reward for our hard work.

```
colors = {0.3: "k", 0.6: "r"}
NUM_FIT_POINTS = 500
fig, ax = plt.subplots(figsize=(10, 6))
for deg_damp, result in results.items():
x_vals = df.loc[deg_damp]["omega"]
y_vals = df.loc[deg_damp]["X"]
x_vals_dense = np.linspace(min(x_vals), max(x_vals), NUM_FIT_POINTS)
r_squared = 1 - result.residual.var() / np.var(y_vals)
uomega0 = ufloat(result.params["omega0_"].value, result.params["omega0_"].stderr)
ugamma = ufloat(result.params["gamma_"].value, result.params["gamma_"].stderr)
q_factor = uomega0 / (2 * ugamma)
plt.scatter(
x_vals, y_vals, None, colors[deg_damp], "+", label=f"Data, ${deg_damp}A$"
)
plt.errorbar(
x_vals,
y_vals,
xerr=None,
yerr=list(result.eval_uncertainty()),
ecolor=colors[deg_damp],
fmt="none",
)
label = (
f"Fit, ${deg_damp}A$\n"
f"$Ο_0={uomega0:P}$\n"
f"$Q={q_factor:P}$\n"
f"$R^2={r_squared:.3f}$"
)
plt.plot(
x_vals_dense,
result.model.eval(result.params, omega_=x_vals_dense),
colors[deg_damp] + "--",
label=label,
)
ax.set_xlim(2.8, 4.0)
ax.set_ylabel("$X$ / arbitrary units")
ax.set_xlabel("$Ο_0$ / $rad s^{-1}$")
plt.legend()
```

## Conclusion

Using the method suggested by the Lab Manual I got a quality factor value of $Q = 5.6$ with standard error of $Ο = 1.2$. By fitting the data to the theoretical model we instead get a value of $Q = 5.57$ with a standard error of $Ο=0.15$, from which actual conclusions can be drawn.

I am assured University of Cambridge will join the 21st century any day now, but until then we will continue to perform non-linear best-fits by hand and creative imagination.

The Jupyter notebook containing the code can be found on my GitHub along with the raw Excel file: GitHub/JeppeKlitgaard.