Censoring occurs when information about a study participant, observation, or measured value is incomplete. In clinical trials, an observation is censored when the event of interest doesn’t happen while the subject is being monitored, or when the subject drops out of the trial before the event can be observed.
Right censoring (sometimes called point censoring) happens when the subject leaves the study before it’s finished (“loss-to-follow-up”) or when the event you’re interested in doesn’t happen during the course of the study (“end-of-study”).
For example, in a 13-week clinical trial for pain relief, as many as 35% of patients failed to complete the study because of side effects from the medication or lack of relief from the placebo (AMSTAT). In general, dropouts from trials are very common: a 2010 report by the National Academy of Sciences states that patient dropout rates can sometimes exceed 30%.
If the event of interest (e.g. death or cure) doesn’t happen during the course of the study, the observation is censored and its event time is recorded as the interval (t,∞), where t is the time at the end of the study. In other words, all that is known is that the event happens, if at all, some time after t.
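The two kinds of right censoring described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library interface: the `Observation` record, the `observe` helper, and the three example subjects are all hypothetical, and the 13-week study length simply echoes the pain-relief example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    time: float   # time (in weeks) at which the subject was last observed
    event: bool   # True if the event was seen, False if right-censored


def observe(event_time: Optional[float],
            dropout_time: Optional[float],
            study_end: float) -> Observation:
    """Convert a subject's (hypothetical) true history into what the study records."""
    # Follow-up stops at dropout or at the end of the study, whichever comes first.
    last_seen = study_end if dropout_time is None else min(dropout_time, study_end)
    if event_time is not None and event_time <= last_seen:
        return Observation(time=event_time, event=True)
    # Otherwise we only know that the event time lies in (last_seen, infinity).
    return Observation(time=last_seen, event=False)


subjects = [
    observe(event_time=5.0, dropout_time=None, study_end=13.0),   # event observed
    observe(event_time=20.0, dropout_time=None, study_end=13.0),  # end-of-study censoring
    observe(event_time=None, dropout_time=8.0, study_end=13.0),   # loss to follow-up
]
censored_fraction = sum(not s.event for s in subjects) / len(subjects)
print(subjects, censored_fraction)
```

Storing a time together with an event flag, rather than dropping the censored subjects, keeps the partial information that the event had not happened by the recorded time.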
Left censoring is when the subject was at risk for the event being studied before the start of the study. It’s not very common for this to be a factor. If it does happen, it’s usually not an issue for clinical trials as the starting point of the trial may be the occurrence of a particular treatment or the development of a disease.
When an entire study group has already experienced the event of interest, it’s called right truncation. For example, you might study groups of individuals who are admitted to the hospital post-stroke. If patients in the study are all high-risk, but haven’t yet experienced the event, then it’s called left truncation. Life insurance policies are examples of left truncation; people enter into a policy and have the event of “death” at some point in time. Truncation is always deliberate and part of a study design, whereas censoring is random.
Change of variable is a technique where, by a process of substitution, you can change the variables in an integral to new variables. Typically you would do this in an effort to simplify the problem, or make it easier to understand.
As an example, imagine you wanted to find the roots of a polynomial. You know all about how to solve quadratic polynomials—in fact, you’ve probably memorized a formula for it—but suppose this polynomial was something rather harder, say, a sixth degree polynomial. Suppose it was:
Sixth degree polynomials are not just hard to solve off the bat; in general, they can’t be solved exactly with a formula at all. However, a change of variables can save the day. Let’s define a new variable, u = x^3. Then you can write the equation as:
Of course, you don’t actually want to know values of u; you want x. You can get that by substituting back, and you find the real solutions to the equation are:
It’s not just solving polynomials where a technique like this comes in useful, though. Change of variable is also used in integration, differentiation, and coordinate transformations. When you use it in calculus, remember to replace the variable everywhere it occurs, including in the differential and the limits of integration; otherwise the change is not meaningful. For differentiation, you could use the chain rule; for integration, you could use u-substitution.
Penn State, Eberly College of Science. Stat 414/415: Probability Theory and Mathematical Statistics. Lesson 22: Functions of One Random Variable. Change of Variable Technique. Retrieved from https://newonlinecourses.science.psu.edu/stat414/node/157/ on August 20, 2019
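To make the polynomial substitution concrete, here is a minimal Python sketch. The equation x^6 - 9x^3 + 8 = 0 is an illustrative choice made for this sketch, not necessarily the polynomial used above, and the helper names are hypothetical. Substituting u = x^3 turns it into the quadratic u^2 - 9u + 8 = 0, which the quadratic formula handles; substituting back means taking a real cube root of each u.

```python
import math


def real_cube_root(u: float) -> float:
    """Real cube root that also works for negative inputs."""
    return math.copysign(abs(u) ** (1.0 / 3.0), u)


def solve_via_cubic_substitution(a: float, b: float, c: float) -> list:
    """Real roots of a*x**6 + b*x**3 + c = 0, found with u = x**3."""
    # Step 1: in terms of u the equation is the quadratic a*u**2 + b*u + c = 0.
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []  # no real u, hence no real x
    sqrt_disc = math.sqrt(disc)
    u_values = {(-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)}
    # Step 2: substitute back; each real u gives exactly one real x.
    return sorted(real_cube_root(u) for u in u_values)


# Illustrative polynomial (an assumption for this sketch): x^6 - 9x^3 + 8 = 0
roots = solve_via_cubic_substitution(1.0, -9.0, 8.0)
print([round(x, 6) for x in roots])
```

For this particular polynomial the quadratic gives u = 1 and u = 8, so the real solutions are x = 1 and x = 2.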
The Clausen function (also called the Clausen Integral) is a transcendental, special function related to the dilogarithm of complex argument. It is widely used in experimental and higher-dimensional mathematics, and in physics, especially quantum theory. Its usefulness stems in part from the fact that many indefinite integrals of trigonometric and logarithmic functions can be expressed in closed form with Clausen functions.
The Clausen function is intimately connected with various other functions including the polygamma function, Dirichlet eta function and the Riemann zeta function. The Lobachevsky function is basically the same function with a change of variable.
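As a rough numerical illustration, the sketch below evaluates the standard order-two Clausen function, Cl_2(θ) = -∫_0^θ ln|2 sin(t/2)| dt = Σ_{k≥1} sin(kθ)/k^2, in two ways. The choice of this form is an assumption on our part (it is the one most references call "the" Clausen function), and the function names, term count, and step count are arbitrary choices for the sketch rather than a recommended algorithm.

```python
import math


def clausen_series(theta: float, terms: int = 100_000) -> float:
    """Partial sum of the Fourier series: sum of sin(k*theta) / k**2."""
    return sum(math.sin(k * theta) / (k * k) for k in range(1, terms + 1))


def clausen_integral(theta: float, steps: int = 100_000) -> float:
    """Midpoint-rule estimate of -integral_0^theta ln|2 sin(t/2)| dt.

    The midpoint rule never samples t = 0, where the integrand has an
    integrable logarithmic singularity.
    """
    h = theta / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.log(abs(2.0 * math.sin(t / 2.0)))
    return -h * total


# Known special value: Cl_2(pi/2) equals Catalan's constant.
CATALAN = 0.915965594177219
series_value = clausen_series(math.pi / 2)
integral_value = clausen_integral(math.pi / 2)
print(series_value, integral_value, CATALAN)
```

The series converges slowly (terms shrink like 1/k^2), so this is only meant to show that the integral and series definitions agree, not to be an efficient evaluation method.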
While this particular form is often called “the” Clausen function, other forms of the integral do exist. For example, Junesang (2016) formulated a new definite integral formula for the Clausen function by using a known relationship between the Clausen function and the generalized Zeta function.
Abramowitz, M. and Stegun, I. A. (Eds.). “Clausen’s Integral and Related Summations” §27.8 in Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing. New York: Dover, pp. 1005-1006, 1972.
The Cochran-Mantel-Haenszel (CMH) Test is a test of association for data from different sources, or for stratified data from one source. It is a generalization of the McNemar test, suitable for any experimental design including case control studies and prospective studies. While the McNemar test can only handle pairs of data (i.e. a single 2 x 2 contingency table), the CMH test can analyze a 2 x 2 x k table, that is, k separate 2 x 2 tables from stratified samples. The results from the tables are weighted (i.e. given different levels of importance) according to the size of the sample in each stratum. For pairs of data, the results from CMH and McNemar will be the same.
The CMH statistic is particularly useful in clinical trials, where confounding variables cause extra connections between the dependent variable and independent variable. To run the CMH test, the data are split into a series of 2 x 2 tables, one for each level of the confounding variable. Each table represents a “clean” connection between the independent and dependent variable, without the confounding variable causing hidden associations. As the test is run on these individual tables and not one combined table, it avoids the spurious associations that can appear when you collapse the individual tables together, a phenomenon called Simpson’s Paradox (Rao et al., 2008).
It’s recommended that you use statistical software, because the CMH statistic is tedious to calculate by hand. The test is often run on a large number of tables (over 30 is common), so the calculations can become quite lengthy. The test is also made a little more complicated by the fact that there are different versions of it. For example, SAS has three versions, Types 1, 2 and 3 (DiMaggio, 2012):
The null hypothesis for the CMH test is that the odds ratio (OR) is equal to one. An odds ratio of exactly 1 means that exposure to property A does not affect the odds of property B. If you get a significant result in this test (i.e. if your test rejects the null hypothesis), then you can conclude there is an association between A and B.
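To show what the software is doing, here is a minimal sketch of the classic Mantel-Haenszel calculation for a set of 2 x 2 tables. It assumes the textbook form of the statistic (observed minus expected count in the top-left cell, summed over strata, with an optional 0.5 continuity correction and one degree of freedom); it is not claimed to match any particular SAS version. The function name `cmh_test` and the two example strata are hypothetical.

```python
import math
from typing import Iterable, Tuple

# Each stratum is one 2 x 2 table [[a, b], [c, d]] given as (a, b, c, d).
Table = Tuple[int, int, int, int]


def cmh_test(tables: Iterable[Table], correction: bool = True):
    """Return (CMH statistic, p-value, Mantel-Haenszel common odds ratio)."""
    sum_diff = 0.0   # sum over strata of a - E[a]
    sum_var = 0.0    # sum over strata of Var[a]
    or_num = 0.0     # sum of a*d/n
    or_den = 0.0     # sum of b*c/n
    for a, b, c, d in tables:
        n = a + b + c + d
        expected_a = (a + b) * (a + c) / n
        var_a = (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        sum_diff += a - expected_a
        sum_var += var_a
        or_num += a * d / n
        or_den += b * c / n
    numerator = abs(sum_diff)
    if correction:
        numerator = max(0.0, numerator - 0.5)
    statistic = numerator ** 2 / sum_var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value, or_num / or_den


# Hypothetical data: two strata of a confounder, 60 subjects in each.
strata = [(10, 20, 5, 25), (20, 10, 10, 20)]
stat, p, common_or = cmh_test(strata, correction=False)
print(round(stat, 4), round(p, 4), round(common_or, 4))
```

A common odds ratio far from 1 together with a small p-value is evidence against the null hypothesis of no association after adjusting for the stratifying variable.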
A cohort study, used in the medical fields and social sciences, is an observational study used to estimate how often disease or life events happen in a certain population. The measures estimated might include incidence rate, relative risk or absolute risk.
The study usually has two groups: exposed and not exposed. If the exposure is rare (for example, exposure to an industrial solvent), then the cohort is called a “special exposure cohort.” Both groups are followed to see who develops a disease and who does not. For example, you could look at cigarette smokers to see who gets breast cancer and who does not. The study would include a group of smokers, and a group of non-smokers.
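Once both groups have been followed, the basic comparison is a short calculation. The sketch below uses made-up counts (1,000 exposed and 1,000 unexposed people) and a hypothetical helper, `cohort_risks`, to compute the absolute risk in each group and the relative risk.

```python
def cohort_risks(exposed_cases: int, exposed_total: int,
                 unexposed_cases: int, unexposed_total: int):
    """Return (risk in exposed, risk in unexposed, relative risk)."""
    # Absolute risk: proportion of each group that develops the disease.
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    # Relative risk: how many times higher the risk is in the exposed group.
    relative_risk = risk_exposed / risk_unexposed
    return risk_exposed, risk_unexposed, relative_risk


# Hypothetical cohort: 30 of 1,000 exposed and 10 of 1,000 unexposed develop the disease.
risk_exp, risk_unexp, rr = cohort_risks(30, 1000, 10, 1000)
print(risk_exp, risk_unexp, round(rr, 2))
```

With these invented numbers the exposed group has about three times the risk of the unexposed group; a relative risk near 1 would suggest no association between exposure and disease.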
Retrospective (Historical): the researcher looks at historical data for a group. Some of the people in this group have developed the disease, and some have not. This can result in finding out who has the disease and when they developed it.
Case-control nested within a cohort: a smaller group is chosen from within the cohort for a deeper look. These investigations may include genotyping, collecting tissue samples or other factors.
Case-cohort: similar to case-control nested within a cohort. The difference is that in a case-cohort study, participants are evaluated for outcome risk factors at any time before the first outcome (i.e. the first incidence of disease).
A prospective cohort study takes a group of similar people (a cohort) and studies them over time. At the time the baseline data is collected, none of the people in the study have the condition of interest. This is in contrast to a retrospective cohort study, which takes a group of people who already have the condition and then attempts to piece together the reasons why. The now famous Framingham Heart Study is one example of a prospective cohort study; the researchers have, to date, studied three generations of Framingham residents in order to understand the causes of heart disease and stroke.
Although none of the participants actually have the disease of interest in a prospective cohort study, some of the cohort are expected to develop the disease in the future. For example, a cohort of thirty-year-old people in a certain town might be studied to see who develops lung cancer. Half of the cohort might be smokers and half may not. This enables comparisons between the two groups.
One major advantage of a prospective cohort study is that researchers don’t have to grapple with the ethical issues surrounding randomized controlled trials (e.g. who receives a placebo and who gets the actual treatment).
A retrospective cohort study (also known as a historic study or longitudinal study) is a study where the participants already have a known disease or outcome. The study looks back into the past to try to determine why the participants have the disease or outcome and when they may have been exposed. In a retrospective cohort study the researcher:
One of the first recognized retrospective cohort studies was Lane-Claypon’s 1926 study of breast cancer risk factors, titled “A Further Report on Cancer of the Breast, With Special Reference to Its Associated Antecedent Conditions.” The study of 500 hospitalized patients and 500 controls led to the identification of most of the risk factors for breast cancer that we know today.
In a retrospective cohort study, the group of interest already has the disease/outcome. In a prospective cohort study, the group does not have the disease/outcome, although some participants usually have high risk factors.
Retrospective example: a group of 100 people with AIDS might be asked about their lifestyle choices and medical history in order to study the origins of the disease. A second group of 100 people without AIDS is also studied, and the two groups are compared.
Prospective example: a group of 100 people with high risk factors for AIDS is followed for 20 years to see if they develop the disease. A control group of 100 people who have low risk factors is also followed for comparison.
A cohort effect is the influence of a group’s life experience on the outcome of an experiment. It’s the effect of being born at the same time (e.g. GenXer vs. Baby Boomer), or in the same region (e.g. born in New Orleans vs. Seattle), or of some other factor that makes the group unique. Cohorts in schools are usually defined by age group, while cohorts in organizations are defined by their date of entry into the job.
Let’s say you were conducting cross-sectional research (a method that compares different age groups at the same point in time) to find out how basic mathematics ability improves with age. You give the same basic math standardized test to groups of students who are 7 years old, 14 years old, and 21 years old. You get the following mean results: