Sum of individual weighted means

This method is described in detail in a paper by Terrenato and Ricci (note 1).

The method is useful for estimating frequencies in time series, because classes of archaeological objects can have long time spans. Simply assigning each object to the central date of its time span can give a very different distribution over time for a class of artefacts.
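As a minimal illustration of the idea (a sketch with made-up dates, not taken from the paper): an object dated between 100 and 200 can be spread evenly over the ten 10-year slices of its range, each receiving a weight of 10/100, whereas the central-date approach places the whole object at year 150. The variable names are assumptions chosen for this example.

```r
# Hypothetical object dated 100-200, with 10-year time slices
start <- 100
end <- 200
step <- 10

# centres of the slices lying inside the date range: 105, 115, ..., 195
centres <- seq(start + step/2, end - step/2, by = step)

# weighted approach: every slice gets an equal share of the object
weights <- rep(step / (end - start), length(centres))

# central-date approach: the whole object falls on a single year
central <- (start + end) / 2

print(data.frame(year = centres, weight = weights))
print(sum(weights))   # the shares of one object add up to one
print(central)
```

Summing such shares over all objects, slice by slice, gives the distribution described on this page.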

Ancient shipwrecks of the Mediterranean

The method has been used to obtain meaningful frequency distributions for data from the catalogue Ancient Shipwrecks of the Mediterranean by A.J. Parker. The author gives some distribution graphs, but they are based on central dates from ceramic assemblages rather than on a weighted mean of each shipwreck's time range (note 2).

Figure: Comparison between the “simple” method using the central date and the more sophisticated one based on the sum of individual weighted means. The difference is remarkable.

Suppose you have a file named data in your working directory, formatted as shown below. Start R from the command line in that directory (it is important to always run R from the same directory, so that your data and history are kept together).

StartDate,EndDate
245,399
250,349
...

Then you can run this code in R:

## import the data frame
data <- read.csv(file="data", header=TRUE)
 
starts <- data$StartDate
ends <- data$EndDate
totalLength <- length(ends)
# width of each time slice, in years; set it to any value you wish,
# or derive it from the range of your data
step <- 10
 
# centres of the first and last time slices, placed half a step
# inside the overall range; this offset is optional and can be
# reversed to cover the full range
startt <- min(starts) + step/2
endt <- max(ends) - step/2
 
sequence <- seq(startt, endt, by=step)
 
years <- data.frame(year=sequence, value=numeric(length(sequence)))
yearsLength <- length(years$year)
 
# each object adds step/(end - start) to every time slice whose
# centre falls strictly inside its date range
for ( i in 1:yearsLength ) {
        for ( j in 1:totalLength ) {
                if ( years$year[i] < ends[j] && years$year[i] > starts[j] ) {
                       years$value[i] <- years$value[i] + step/(ends[j] - starts[j])
                }
        }
}
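To try the calculation without a data file, the same logic can be wrapped in a function and run on a few made-up date ranges. This is only a sketch: the function name sum_iwm and the sample dates are assumptions for illustration, and it assumes that every start date is earlier than its end date, with the slices running from the earliest start to the latest end.

```r
# Hypothetical helper wrapping the calculation; not part of the original script.
sum_iwm <- function(starts, ends, step = 10) {
  # centres of the time slices, half a step inside the overall range
  centres <- seq(min(starts) + step/2, max(ends) - step/2, by = step)
  value <- sapply(centres, function(y) {
    # objects whose date range strictly contains this slice centre
    inside <- y > starts & y < ends
    sum(step / (ends[inside] - starts[inside]))
  })
  data.frame(year = centres, value = value)
}

# made-up date ranges for three objects
starts <- c(100, 100, 150)
ends   <- c(200, 150, 200)

result <- sum_iwm(starts, ends, step = 10)
print(result)

# optional: draw the distribution
# barplot(result$value, names.arg = result$year)
```

In this toy data every range is a whole number of slices long and aligned with them, so each object contributes a total of one; with real data the totals are only approximately one per object.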

If you are likely to edit the code often, tweaking the parameters or enhancing it, you might want to save it in a file named, e.g., SumIWM.R (make sure the file is in the directory you are working in), and then run it as a whole with the command

source("SumIWM.R")

in the R console. This way you can run the script, see the results, maybe tweak some settings, and run it again to see the differences. This code is just an example and it comes under the same license as the other documentation, the GNU Free Documentation License.

You can obtain a vector of central dates simply with

CentralDate <- (ends + starts)/2
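For a comparison like the one in the figure, the central dates can be counted in time slices of the same width. The following is a sketch with made-up dates; cut() and table() are base R functions, while the variable names breaks and counts are assumptions for this example.

```r
# made-up date ranges for three objects
starts <- c(100, 100, 150)
ends   <- c(200, 150, 200)
step <- 10

CentralDate <- (ends + starts)/2

# slice boundaries covering the whole range
breaks <- seq(min(starts), max(ends), by = step)

# number of central dates in each slice;
# right = FALSE makes each slice include its lower bound only
counts <- table(cut(CentralDate, breaks = breaks, right = FALSE))
print(counts)
```

Each object is counted once, in a single slice, which is why this approach tends to produce sharper peaks than the weighted sum.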

This should work even with BC dates, as long as you record them as negative numbers, e.g. -480 for 480 BC.
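A small check with a hypothetical object dated from 480 BC to 380 BC shows that the arithmetic is unchanged with negative years. Bear in mind that the historical calendar has no year 0, so ranges that cross from BC to AD come out one year too long unless corrected.

```r
# Hypothetical object dated from 480 BC to 380 BC
start <- -480
end <- -380

CentralDate <- (end + start)/2
print(CentralDate)    # -430, i.e. 430 BC
print(end - start)    # 100 years
```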

1) Nicola TERRENATO and Giovanni RICCI, I residui nella stratificazione urbana. Metodi di quantificazione e implicazioni per l'interpretazione delle sequenze: un caso di studio dalle pendici settentrionali del Palatino, in “I materiali residui dello scavo archeologico”, edited by Federico GUIDOBALDI, Carlo PAVOLINI and Philippe PERGOLA, Roma, École Française de Rome, 1998.
2) This technique was first applied to Parker's catalogue in Enrico ZANINI, Ricontando la terra sigillata africana, in «Archeologia Medievale» 23, 1996, p. 677.
sum_of_individual_weighted_means.txt · Last modified: 2012/11/30 09:59 (external edit)