Topic: Statistical analysis of IR data  (Read 5007 times)

Offline dipesh747

  • Regular Member
  • ***
  • Posts: 89
  • Mole Snacks: +7/-7
Statistical analysis of IR data
« on: March 12, 2012, 11:18:33 AM »
Hi, this probably isn't the right forum to be posting on but I am completely stuck and don't know where else to post lol

So I have been working on a project all year, and it is to do with hair and how it is affected by bleaching, atmosphere, etc. I would like to talk about the natural variation within each hair sample.

I have IR data for a number of hair samples, and the data I would like to talk about is the peak height of Cysteic acid, Amide III and Amide I.

So I have all the peak height data in an Excel spreadsheet.
I have calculated standard deviations and means.

Now all I want is one of those bell graphs so I can show how far each data point is from the mean. I'm really confused as to how to do this; I can't work out what to put on each axis. Everything I have read online seems to use completely random numbers.

Any help would be much appreciated (or even a link to a good maths forum where I could ask the same questions).

Thanks

Offline Arkcon

  • Retired Staff
  • Sr. Member
  • *
  • Posts: 7367
  • Mole Snacks: +533/-147
Re: Statistical analysis of IR data
« Reply #1 on: March 12, 2012, 11:32:52 AM »
Kinda hard to understand the question ... if your samples follow a normal (or Gaussian) distribution, then the individual points should form a bell curve when plotted, and the mean should appear. Did you try to build a scatter plot and see how it looks?
Hey, I'm not judging.  I just like to shoot straight.  I'm a man of science.

Offline dipesh747

Re: Statistical analysis of IR data
« Reply #2 on: March 12, 2012, 11:56:24 AM »
Ok I will try and explain again.

So let's say I have 20 hair samples, and each sample has 6 IR spectra. For each spectrum I have written down the peak height of a certain peak (call it peak A).

So for the 20 hair samples I have 120 peak heights for peak A.

So for these 120 absorptions I would like to show how large the natural variation is within hair.

I don't really know what to do, I have calculated a mean and standard deviation for the data set and I have worked out how many standard deviations each data point is from the mean.
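
The "how many standard deviations from the mean" calculation described above can be sketched in Python as follows. This is only an illustration: the function name `z_scores` and the peak heights are made up, and the sample standard deviation (n - 1 denominator) is an assumption about what was used in the spreadsheet.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return how many sample standard deviations each value lies from the mean."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator); an assumption
    return [(v - m) / s for v in values]

# Made-up peak heights for illustration only, not the poster's data.
peak_heights = [0.052, 0.061, 0.048, 0.070, 0.055, 0.066]
for height, z in zip(peak_heights, z_scores(peak_heights)):
    print(f"{height:.3f}  z = {z:+.2f}")
```

These z values are what would go on the x axis of a bell graph; the y axis is then how many (or what proportion of) points fall near each z value.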

I don't actually know what distribution my samples are.

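I can do a scatter plot of absorption vs distance of the IR spectrum from the root. But it doesn't show any probability curves; it just shows this:

[Attached scatter plot: absorption (y axis) vs distance from root (x axis); image not preserved]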

Sorry I'm a bit useless at stats, been a long time since I have done stats.

Offline Arkcon

Re: Statistical analysis of IR data
« Reply #3 on: March 12, 2012, 12:06:28 PM »
The chart makes it more clear.  You have multiple observations for each data point, and would like to express their variation, not the population's variation.  Maybe what you want are error bars?  Does Excel's help make it easy for you to find out how to do that?  Because I really don't remember. 
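
If error bars are what is wanted, the numbers behind them can be sketched in Python rather than Excel. This assumes the data can be arranged as (distance from root, peak height) pairs; the function name `summarize_by_distance` and the values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize_by_distance(records):
    """records: iterable of (distance_cm, peak_height) pairs.

    Returns {distance: (mean, sample SD)}. The SD can serve as the
    half-length of an error bar. Each distance needs at least 2 values.
    """
    by_distance = defaultdict(list)
    for distance, height in records:
        by_distance[distance].append(height)
    return {d: (mean(h), stdev(h)) for d, h in sorted(by_distance.items())}

# Made-up values for illustration only.
records = [(0, 0.050), (0, 0.058), (4, 0.061), (4, 0.066), (8, 0.049), (8, 0.071)]
for distance, (m, s) in summarize_by_distance(records).items():
    print(f"{distance:>2} cm: mean = {m:.4f}, error bar = +/- {s:.4f}")
```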

Even better, does anyone know if Excel can fit a regression line to the max and min error bars? Because that would be really useful. Hmmm ... maybe you want to try and find another, more science-oriented spreadsheet / charting program.

Offline dipesh747

Re: Statistical analysis of IR data
« Reply #4 on: March 12, 2012, 12:28:06 PM »
I do in fact want to measure population variation.

With the graph I showed you, the y axis is absorption and the x axis is the distance from the root at which the IR spectrum was taken. Although there appears to be greater variation in absorption at some distances from the root compared to others, this is completely down to randomness and the fact that hair is heterogeneous.

I would like to look at the data set as a whole and show how large, and how significant, the variation can be. With the normal distribution, for example, a graph makes it clear that 68% of samples are within one standard deviation, 95% within two, etc.

I would like a graphical representation like that for my data.

I can get hold of software other than excel, but I don't know how I am going to express the data yet.

Have you got any ideas of the right direction I could go?


Offline dipesh747

Re: Statistical analysis of IR data
« Reply #5 on: March 12, 2012, 12:58:03 PM »
Ok so what I have done now is to work out the standard deviation for the set of data. I have then calculated how many standard deviations each point is from the mean and what it has shown me is:

37% of data within half a standard deviation
69.4% of data within one deviation
97.2% within 2 deviations
100% within 3
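
For comparison, a normal distribution puts about 38%, 68%, 95% and 99.7% of values within 0.5, 1, 2 and 3 standard deviations, so the percentages above are close to what normal data would give. A small Python sketch of that comparison follows; the function names are illustrative and the input is assumed to be a list of z values (deviations from the mean in units of SD).

```python
import math

def fraction_within(z_values, k):
    """Fraction of z values whose absolute value is at most k."""
    return sum(1 for z in z_values if abs(z) <= k) / len(z_values)

def normal_fraction_within(k):
    """Probability that a normally distributed value lies within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (0.5, 1, 2, 3):
    print(f"within {k} SD: normal distribution expects "
          f"{100 * normal_fraction_within(k):.1f}%")
```

Calling `fraction_within` on the tabulated z values for each `k` gives the observed column to set beside the expected one.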

I have also worked out the coefficient of variation (SD/mean), which is 20%. So I think this 20% value shows that the data set is quite random.

Offline fledarmus

  • Chemist
  • Sr. Member
  • *
  • Posts: 1675
  • Mole Snacks: +203/-28
Re: Statistical analysis of IR data
« Reply #6 on: March 12, 2012, 04:07:28 PM »
It looks like what you are trying to do is determine the distribution of measurements of a single value in a population, equivalent to the height of the population, or weight of a population. In this case, you are using the absorption of a specific IR wavelength. This gives you a probability density function - the probability that the value of the absorption of any piece of hair in the population will fall between certain values.

Given the mean and the variance of the peak, you can calculate the normal distribution - http://en.wikipedia.org/wiki/Normal_distribution will show you how. Two other calculations can help you refine your measurements - the skewness and the kurtosis. Skewness tells you whether the peak is symmetrical, and kurtosis tells you how "peaked" the peak is - how tall the peak is and how short the tails are.
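
As a rough illustration of those three calculations, here is a Python sketch. The function names and the data are made up, and the skewness and kurtosis use the simple population (moment) formulas; Excel's SKEW and KURT apply small-sample corrections, so their results will differ slightly.

```python
import math
from statistics import mean, pstdev

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and SD sigma at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def skewness(values):
    """Third standardized moment; 0 for a symmetric data set."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

def excess_kurtosis(values):
    """Fourth standardized moment minus 3; close to 0 for normal data."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 4 for v in values]) - 3.0

# Made-up peak heights for illustration only.
data = [0.052, 0.061, 0.048, 0.070, 0.055, 0.066, 0.059, 0.063]
mu, sigma = mean(data), pstdev(data)
print("skewness:", skewness(data))
print("excess kurtosis:", excess_kurtosis(data))
print("curve height at the mean:", normal_pdf(mu, mu, sigma))
```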

However, you don't have 120 independent measurements; you have twenty samples, each measured 6 times. How do you account for differences in those measurements? Are you assuming that the hairs are different in different portions, or does this reflect the errors in processing and measurement? For example, did you try running the same sample at different times during the course of the experiment? How does measurement-to-measurement variability differ from sample-to-sample variability from a single source, to source-to-source variability? This gets a lot deeper into statistics than I can comfortably wander, and far deeper than I could try to communicate here.
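
A first look at that question, short of a proper variance-components analysis, is to compare the spread inside each sample with the spread of the sample means. The Python below is a sketch under the assumption that the data is grouped as one list of measurements per hair sample; the function name and numbers are made up. Note that the spread of the sample means still contains some within-sample noise, so this is a rough comparison only.

```python
from statistics import mean, variance

def variance_components(groups):
    """groups: list of lists, one inner list of measurements per hair sample.

    Returns (average within-sample variance, variance of the sample means).
    """
    within = mean([variance(g) for g in groups])   # spread inside a sample
    between = variance([mean(g) for g in groups])  # spread between sample means
    return within, between

# Made-up values for illustration only: 3 samples, 4 measurements each.
groups = [
    [0.050, 0.054, 0.049, 0.057],
    [0.066, 0.061, 0.070, 0.064],
    [0.058, 0.055, 0.060, 0.052],
]
within, between = variance_components(groups)
print(f"within-sample variance:   {within:.6f}")
print(f"variance of sample means: {between:.6f}")
```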

Offline dipesh747

Re: Statistical analysis of IR data
« Reply #7 on: March 12, 2012, 05:10:26 PM »
Ok. I have looked on that wiki page, and I cannot understand how they have created their bell graphs; specifically I don't understand what they have used on the y axis. (x axis is the standard deviation from the mean, I have this data tabulated)

Yes, I have got 20 samples. Each sample had 6 IR spectra (each spectrum was an average of 32 scans), and the 6 IR spectra corresponded to different points along the hair from the root, i.e. 0cm (root), 4cm, 8cm, 12cm, 16cm and 20cm.

And yes all hair spectra are different - however this is not surprising!
« Last Edit: March 12, 2012, 05:30:29 PM by dipesh747 »

Offline fledarmus

Re: Statistical analysis of IR data
« Reply #8 on: March 13, 2012, 08:34:13 AM »
The y axis is the relative proportion of the population around that value of x (strictly, a probability density). If you bin the data and plot the proportion in each bin, the heights range from 0 to 1; if you are measuring invariant data (for example, the length of a population of 12" rulers), you should end up with a single tall peak that goes all the way to 1. For the continuous curve, the total area under the curve (integral) is a constant representing the total population, so as the width of the curve (represented by your variance) increases, the height decreases, and for a very narrow curve the height can exceed 1.

Thank you for your description of your experimental method. It shows that there are a couple of other dependencies you could calculate. For example, is any of the variability in the IR spectra systematically dependent on the distance from the root to the tip of the portion of hair used? And is any of the variability systematically dependent on whether the two samples were from different portions of the same hair, or were from different hairs? This will require some multivariate analysis of your results.
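
The simplest check for the first of those dependencies is a straight-line fit of peak height against distance from the root. The Python below is a minimal sketch of ordinary least squares, not the full multivariate analysis; the function name and the data are made up for illustration.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Made-up values for illustration only: distance from root (cm) and peak height.
distances = [0, 4, 8, 12, 16, 20]
heights = [0.050, 0.053, 0.051, 0.058, 0.060, 0.062]
slope, intercept = least_squares_line(distances, heights)
print(f"slope = {slope:.5f} per cm, intercept = {intercept:.4f}")
```

A slope close to zero relative to the scatter would be consistent with the variation being random rather than tied to distance from the root.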

Offline dipesh747

Re: Statistical analysis of IR data
« Reply #9 on: March 14, 2012, 01:19:31 PM »
As I am measuring peak heights and they are accurate to within 4dp, the standard deviations from the mean are accurate to 4sf, so it is unlikely for any two data points to be the same. So when I did the scatter plot it just looked like a straight line. Is there a way to fix this?
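
One common way around this, if the aim is a bell-shaped picture, is to group the values into bins and plot the count in each bin (a histogram) instead of plotting each unique value. The Python below is a sketch of that binning; the function name, the number of bins and the data are assumptions for illustration.

```python
def histogram_counts(values, n_bins):
    """Split the range of values into n_bins equal bins and count values per bin.

    Returns (bin edges, counts). Assumes the values are not all identical.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that the maximum value falls in the last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts

# Made-up z values for illustration only.
z_values = [-1.8, -1.1, -0.7, -0.4, -0.2, 0.0, 0.1, 0.3, 0.6, 0.9, 1.4, 2.1]
edges, counts = histogram_counts(z_values, 4)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:+.2f} to {right:+.2f}: {'#' * count}")
```

In Excel the same idea is a column chart of counts per bin, with the bin ranges on the x axis and the counts on the y axis.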

I don't fully understand what you mean by your other comments. Do you mean: does the IR spectrum change depending on which portion of the hair it was taken from?

Offline dipesh747

Re: Statistical analysis of IR data
« Reply #10 on: March 17, 2012, 10:04:47 AM »
Hello.

I used the Excel function =norm.dist() and I obtained the bell-shaped curve I have been looking for. However, the y axis goes up to 30 and not 1 (like I have seen in all my reading around this topic).

Can anyone explain this result?

Is this not the right excel function to be using?
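
One possible explanation, offered as a sketch and not as a diagnosis of the actual spreadsheet: with its last argument set to FALSE, NORM.DIST returns a probability density, not a proportion. The peak of a normal density is 1 / (SD * sqrt(2 * pi)), so it exceeds 1 whenever the SD is smaller than about 0.4. The SD of the data is not given in the thread; the Python below only shows what SD a peak of 30 would imply. The function names are illustrative.

```python
import math

def normal_pdf_peak(sigma):
    """Maximum height of a normal density curve with standard deviation sigma."""
    return 1.0 / (sigma * math.sqrt(2 * math.pi))

def sigma_for_peak(peak):
    """Standard deviation that would give a normal density curve this peak height."""
    return 1.0 / (peak * math.sqrt(2 * math.pi))

print("peak height when SD = 1:", normal_pdf_peak(1.0))
print("SD implied by a peak of 30:", sigma_for_peak(30.0))
```

Multiplying the density by the width of a histogram bin gives the approximate proportion of data expected in that bin, and that proportion does stay between 0 and 1.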

