Introduction

The documentation and interpretation of settlement plans is and has always been a major task for archaeologists. Intra- and cross-cultural comparisons are mostly done through detailed qualitative or semi-quantitative descriptions and typologies. It is furthermore generally assumed that certain characteristics of a settlement plan’s layout will reflect aspects of the social and economic organisation of its society, depending on a range of culture-specific factors (e.g. Eglash, 1999; Ensor, 2013; Fraser, 1968; Furholt, 2016). A strong grid-like regularity of a settlement layout may reflect the presence of overall planning, as exemplified through imperial Roman cities and military camps. Conversely, irregular or fractal-like settlement layouts should be expected to reflect more autonomous or self-organised builders (see Batty, 2005; Batty & Longley, 1994; D’Acci, 2019; Lagarias & Prastacos, 2021; Tannier & Pumain, 2005, for theoretical discussions within Urban Science). Irregularity levels on settlement plans could thus potentially serve as a proxy for the level of planning, and furthermore be indicative of the type and scale of social organisation at hand. Interpreting socio-political systems from settlement plans alone would probably be over-optimistic, as spatial regularity can result from shared cultural norms of e.g. house placement and orientation, without interference from any top-down architect. However, fractal patterns are only exceptionally known to result from conscious human planning (and mostly from the twentieth century onwards), and regular grids do not emerge from random processes, so a quantifiable irregularity index would in any case be useful as a basis for further social interpretation. But in the absence of adequate reproducible quantitative methods fit for this purpose, studies that have aimed in this direction within Archaeology have so far been largely dependent on qualitative description. The arguably best options have been approaches based on multivariate statistics methods like correspondence analysis (e.g. Furholt, 2016), which, while performing reasonably well, may suffer from the issue of reproducibility.

The term fractal, from latin fractus meaning broken or irregular, was pinned by the French-American mathematician Benoît Mandelbrot (1975; popularised in English in 1982), and was loosely defined as a set exhibiting the properties of self-similarity — the same pattern is repeated over and over — and scale invariance — the pattern looks similar no matter the scale of observation. The term also alludes to the fact that such patterns are characterised by dimension values that are fractions rather than integers like in Euclidean geometry (fractals with integer dimension values do exist, but are special cases). Mandelbrot elegantly showed how fractals could convincingly model a range of natural and social phenomena, from the mapping of craters on the Moon and the distribution of galaxies, to road networks and corporate hierarchies. Fractal analysis of spatial patterns is often centred on the estimation of the pattern’s fractal dimension (denoted D, see Methods section for details; in Mandelbrot’s works also referred to as the Hausdorff–Besicovitch dimension). This is a measure of the degree to which a pattern fills its embedding space, and — in contrast to simpler density measures — it is measured across a range of different scales. In theoretical fractal patterns, the scales can range from the infinitely small to the infinitely large, but when working with empirical data, there will always be a lower and an upper bound to meaningful scales. Furthermore, empirical fractals are often referred to as self-affine rather than self-similar when the smaller copies of the pattern are stretched or distorted versions of the whole. When elements of randomness are added to the construction of a fractal, one can talk of statistical self-similarity, often resulting in surprisingly natural-looking shapes (see Mandelbrot, 1982: 200–4 and 236 for details). Methods for estimating the fractal dimension of empirical patterns have continued to be developed since (see Methods section), but in general terms, any pattern consisting of dots or segments of a straight (i.e. one-dimensional) line, has a fractal dimension between 0 and 1. Furthermore, for a pattern that stretches across a two-dimensional plane without filling it entirely, like a settlement plan, 1 < D < 2, and for a pattern embedded in three-dimensional space, like a topographic landscape, has a fractal dimension between 2 and 3. The higher the value, the more fully the pattern fills its embedding Euclidean space.

While this framework allows for quantification of even the most irregular patterns, a significant drawback recognised by Mandelbrot, is that by itself it is far from being a full characterisation. Just as a specific mean value can characterise very different looking distributions, a fractal dimension estimate can be the same for very different looking patterns. This is why he also proposed a complementary measure, lacunarity, from the latin lacuna meaning lake or gap. It quantifies the regularity in distribution of gaps between elements, also across scales. Together, the fractal dimension and lacunarity of a pattern provide a good quantitative description of its texture. For two-dimensional patterns, both analyses are done on binary images with pixels being either occupied (pattern) or non-occupied (background).

In recent years, this fractal image analysis has increasingly been used within the field of Urban Science for the purpose of quantifying and comparing different urban textures, providing an empirical basis for interpretations on functional and social aspects — like the interplay between modern city centres and suburbs, socioeconomic disparities, housing price and density gradients from the centre, efficiency of urban transport networks, access to green areas, as well as various growth processes over time (for a recent review, see Jahanmiri & Parker, 2022). This development has been greatly facilitated by advances and accessibility of computing software, and the recent advent of open-source programming has aided in the transparency of the applied methods and increased the comparability of different studies. Within Archaeology, application of fractal analysis for the quantification and study — including through simulation — of settlement plans, was proposed already in the seminal paper by Zubrow, (1985), and has been sporadically attempted since, mainly by Mesoamerican specialists (Brown & Witschey, 2003; Farías-Pelayo, 2017; Oleschko et al., 2000). Apart from settlement plans, the analysed patterns can be site topography (Farías-Pelayo, 2017), artworks (Brown et al., 2005; Lara Galicia, 2013), artifact outlines (Farías-Pelayo, 2017; Zubrow, 1985), or use-wear surface textures (Rees et al., 1988; Stemp, 2014). Fractal analysis is furthermore not limited to the study of images and patterns embedded in geographical space. As a general methodological toolkit, it has potential for application within a wide variety of topics, from lithic technology to demography, social networks, diffusion and migration (for overviews, see Brown et al., 2005; Diachenko, 2018. For a more detailed introduction, see Brown & Liebovitch, 2010). Fractals are moreover intimately related to power-law and Zipf’s law distributions and derived approaches (e.g. Crabtree et al., 2017; Grove, 2011; Maschner & Bentley, 2003; Strawinska-Zanko et al., 2018), including Settlement Scaling Theory (Gomez-Lievano et al., 2012; Lobo et al., 2020; Smith et al., 2021), with much overlapping theoretical framework (see Mandelbrot, 1982: 341–8). Many other applications that have not yet been tested in Archaeology are also conceivable. The systematic exploration of this potential is, however, long overdue. Most studies, including this one, only scratch the surface of what is possible, and the groundwork necessary for the generalisation of the methods has yet to be done.

In the present study, I provide a first attempt of a systematic exploration of how different settlement related variables — layout, size, density and house-size distribution — as well as varying degrees of added random spatial noise, influence estimates of fractal dimension and lacunarity on computer simulated settlement plans. Results are furthermore compared to those from a set of analysed archaeological settlement plans, sampled from two late Prehistoric study areas: Linear Pottery settlements in the Žitava valley in south-west Slovakia, and Trypillia settlements in the Southern Bug—Dnieper interfluve in central Ukraine. The overall goal is to assess the applicability of fractal dimension and lacunarity estimates as measures of spatial regularity/irregularity in settlements, and thus as quantitative proxies for household autonomy versus urban planning, as well as for quantitative characterisation of settlement plans within and across culture-historical settings. Though this study represents only a first step towards a formal generalisation of fractal analysis methods for this purpose, all analysis here is done through an R code script which is freely available for consultation and further development (see Data and Code Availability statement below), hopefully facilitating further research.

Material and Methods

When sampling settlement plans for the present study, I wished to factor out the almost omnipresent issue of missing data, since the question dealt with here — the relationships between fractal dimension, lacunarity and various settlement layout parameters — is still within quite uncharted territories. Geomagnetic imagery has undergone a rapid development in recent years, making it now possible to map entire settlements effectively where no architectural remains are preserved on the surface (e.g. Rassmann et al., 2014, 2016). Furthermore, sampling settlement plans from different culture-historical groups that have different settlement characteristics facilitates evaluation of the effects of these characteristics on the fractal dimension estimates. An underlying assumption of this analysis is that the following parameters are relevant for comparison with fractal dimension and lacunarity: basic layout, element count (i.e. number of buildings), element size distribution (house sizes), and image density (fraction of black pixels to total pixel count). The selection of the Žitava Valley Linear Pottery settlements in south-west Slovakia and the central Ukrainian Trypillia B2/C1 settlements (including two so-called mega-sites), was based on their large variation within these parameters, while at the same time representing comparable architectural traditions within the technical possibilities of wood-post structures, and both being extensively documented through recent geomagnetic surveys with very little missing data (Hale, 2020; Müller-Scheeßel et al., 2020; Ohlrau, 2020; for general overviews of the sites, see Furholt et al., 2020; Gaydarska, 2020; Müller, Rassmann & Videiko, 2016; Ohlrau, 2020). The Linear Pottery and Cucuteni-Trypillia cultural complexes are also, arguably, situated on opposite ends of the spectrum of social complexity and hierarchy within the European Neolithic/Chalcolithic, and the social organisation of both have been vividly debated (e.g. Coudart, 2015 and Hofmann et al., 2019 for discussions specifically related to architecture). Contributing to these debates is, however, not the primary focus of this paper.

The binary plan images used for the actual analysis were prepared from settlement shapefiles that were generously shared from the GIS analysts of the different projects that generated the data (Hale, 2020; Müller-Scheeßel et al., 2020; Ohlrau, 2020). These had been drawn by the different authors based on their expert interpretations of the geomagnetic images and on targeted excavations. The shapefiles have thus been drawn by at least three different individuals, but on the other hand this ensures that the data used for the present analysis is consistent with the extant spatial analyses made by the different projects. Interpreting the presence and size of a house from anomalies in a geomagnetic image is not done exactly the same way in Linear Pottery and Trypillia contexts. The former house type is generally a longhouse of various dimensions, limited on the long sides by extraction pits that provided material for the wall daub upon construction. These longitudinal pits are what is easiest to spot from Linear Pottery house remains on geomagnetic scans, and their location and length provide approximate outlines of the associated houses (Winkelmann et al., 2020: 41–45). Trypillia houses are not as identifiable from extraction pits, even though they also had wattle-and-daub walls, but the very recurrent practice of house destruction by burning at the end of their use-life made these houses even more recognisable on geomagnetic images. Houses are in these cases outlined by the raised clay platform and floor which collapsed on the spot during destruction, though a few cases of walls collapsing outwards — and thus disturbing the reading of the house outline and size — have also been documented through excavation (Chernovol, 2012: 183–4). A further challenge with Trypillia settlement plans is that not all houses were burnt, and some are subject to erosion — about 21% (610 of 2932) of the surveyed houses at Maidanetske (Ohlrau, 2020: 61–4) and 26% (368 of 1445) at Nebelivka (Hale, 2020: 124–9) were recorded as unburnt, eroded or probable structures — but these can often be interpreted, albeit with some less certainty, from weaker features. Recent studies have increasingly come to question the labelling of these houses as “unburnt” or “eroded”, since excavations tend to yield highly burnt materials also here, though in lesser quantities, indicating that these houses may simply be constructed with less daub or lacking a raised clay platform, possibly indicating a different function (see Pickartz et al., 2019: 1649–50 for further discussion on this point). For both the Linear Pottery and the Trypillia settlements that were included in the present study, there is little to no evidence of overlap between houses, suggesting that inhabitants to a large extent avoided constructing new houses on slots where there had already been one, for some time after abandon of the previous house. All features interpreted as building structures, no matter the type, were included in the images prepared for this analysis. For the Linear Pottery culture, there are commonly used typologies of houses based on size and number of internal modules (Coudart, 1998; Modderman et al., 1970: 100–120), while especially Trypillia mega-sites are known to be organised around so-called mega-structures (e.g. Hofmann et al., 2019; Müller, Hofmann & Ohlrau, 2016) or assembly houses (e.g. Chapman et al., 2016; Hale, 2020: 133–8). In this paper, no categorical distinction is made between such house types, since the goal of the analysis is to quantify the overall layout and noise in the settlement plan. No features other than buildings were included in the analysis. The images analysed in this paper were created in QGIS 3.20.1, as boolean (black-and-white) raster layers with 0.5 m resolution, where polygons were rendered with no stroke and black fill, cropped to the minimal extent of the vector layer (Figs. 1, 2 and 3).

Fig. 1
figure 1

Individual settlement plans analysed for this paper. Plans a-c, e and f represent Linear Pottery settlements in the Žitava valley of south-west Slovakia, while d and g-i represent Trypillia settlements in the Southern Bug—Dnieper interfluve in central Ukraine. Plans a and h show neighbourhoods (a) and quarters (b) that were also analysed separately. All lines, letters and scales are for illustration only and were not included in the analyses. All images were generated in QGIS 3.20.1 with 0.5 m resolution. After Müller-Scheeßel et al., (2020; a-c, e and f), Ohlrau (2020; d, g and i) and Hale, (2020; h)

Fig. 2
figure 2

Temporal development of Linear Pottery settlement at Vráble (south-west Slovakia), with 16 separate time samples at 20 years intervals (years cal. BCE). For each time sample, only houses that were already constructed but not yet abandoned (in black) were analysed. Grey shaded houses show the total settlement plan for illustration only. For the analysis, image frames were adjusted for each time sample. GIS data and house construction model after Müller-Scheeßel et al., (2020); house duration model after Meadows et al., (2019)

Fig. 3
figure 3

© D. Hale under the ADS Terms of Use and Access

The preparation process for the images analysed in this study, from raw geomagnetic (left), to archaeologically interpreted (centre) and binary with buildings only (right). The first two steps were done within the cited research projects (see Fig. 1), and the last step by the author based on the shared shapefiles. The frame of the resulting binary images was reduced to minimal height and width. The figure shows this process for the Nebelivka site plan, with left and centre images adapted from Chapman et al., (2018),

For two settlements — the Linear Pottery site of Vráble and the Trypillia site of Nebelivka — assignment of houses into quarters was equally drawn from expert interpretations reported in the dedicated publications (Hale, 2020: 139; Winkelmann et al., 2020). While the separation of Vráble into three “neighbourhoods” is quite obvious from the plan itself, the sorting of Nebelivka into “quarters” from A to N is more dependent on expert arguments and intuition. Both the recognition of houses and their sizes, as well as higher level quarters or neighbourhoods, from geomagnetic images, are tasks that may be automated in a not-too-distant future, but which remain for the time being dependent on some degree of subjective interpretation. A direct analysis of the geomagnetic images themselves would, on the other hand, include more noise from both geological features and archaeological structures from other periods (in many cases including modern ones) or features that are otherwise less related to the architectural layout of the settlement.

In order to investigate possible temporal dynamics of fractal dimension (D) and lacunarity (L) values, a subset was prepared of images with coeval settlement plans only, that is plans consisting only of features that coexisted at a given point in time. This was not possible for the Trypillia settlements, since the vast majority of houses have not been individually dated. Statistical estimations of numbers of coeval houses based on Bayesian modelling of available radiocarbon dates have been made both for Nebelivka (Millard, 2020: 253–6; Müller et al., 2022) and Maidanetske (Ohlrau, 2020: 233–5; Ohlrau, 2022: 86–8), indicating possible maximum coeval habitation at these mega-sites of about 33% and 52% of houses respectively. These analyses cannot tell, however, which houses were coeval, and thus for the time being no reliable images of coeval settlement plans can be reconstructed, even though general tendencies are proposed (Shatilo, 2021: 247; Müller et al., 2022: 218–9). For Linear Pottery houses, a dating proxy was recently proposed from the observation that house orientation drifts anti-clockwise within settlements at an approximate rate of ca. 1.3 degrees per decade (the exact linear relationship between house orientation and modelled construction date needs to be calibrated for each settlement, see Müller-Scheeßel et al., 2020). The dating model proposed for Vráble, with construction year cal. BCE as \(\frac{mean\;orientation\;+\;651.016}{0.129486}\) was applied to all houses at the settlement. House duration was set to the median of 27.5 years reported in Meadows et al., (2019), and 16 temporal samples with 20 years intervals were defined for coeval settlement plans (Fig. 2). Note that both the modelled house construction year and the house duration parameters are attributed with some level of uncertainty that is difficult to account for precisely. Variations in these values could potentially result in different numbers of houses on the coeval settlement plans, and thus interfere with the D and L estimates. A quantitatively more robust analysis could be performed by replacing the single values (construction year and house duration) with probability distributions, and generating multiple plans per time sample through bootstrapping. This would however result in a computationally far more demanding exercise, beyond the scope of this paper. The D and L estimates of the temporal slices of Vráble reported here should thus be considered as illustrative only.

The subset of synthetic settlement plans with varying levels of added noise was constructed by imitation of the empirical plans of Nebelivka and Moshuriv (Trypillia), Vráble and Úľany (Linear Pottery), as an attempt of replicating their D and L levels with the given values of house count (“n”), house-size distribution (proxied through the Gini coefficient), and basic layout (grid layout for the Linear Pottery and radial for the Trypillia settlements). The layouts were further detailed with actual distance measurements between clusters/neighbourhoods for the Linear Pottery settlements and radii for the Trypillia settlements (a single radius for the circular Moshuriv, two for the oval Nebelivka, see code script for more details). The plan images were generated in R using the ggplot2 package (Wickham, 2016), houses were rendered as circles with areas proportional to house sizes using the geom_circle()-function from the ggforce package (Pedersen & RStudio, 2022) and adding coord_fixed() to ensure identical distances on the x and y axes. Attempts were made towards representing houses as rectangles rather than circles, which for the Linear Pottery plans would be straight-forward, while resulting in overly complex code for the Trypillia plans, where houses are oriented in all directions. Initial attempts of Linear Pottery plans with houses simulated as circles and rectangles respectively made little difference in D and L estimates, however, and it was preferred to keep a single rendering method on all synthetic plans. All plot details other than houses were removed, and the plots were stored with minimal extent and 0.5 m pixel resolution. Each synthetic settlement plan was rendered in seven versions with varying levels of added noise. The zero-noise plot was an idealised imitation of the real settlement plan in each case, while the noisy plans were generated by letting each house perform a random walk, i.e. a displacement in a random direction with an arbitrary step length of five metres and a given number of steps. For each settlement, a rendering was stored with 1, 2, 4, 8, 16 and 32 steps, resulting in four sets of seven plots (including the zero noise plots), in total 28 synthetic plots (Fig. 4). This procedure is not meant to model in any way how exactly the various archaeological settlement plans were generated in the past (houses do not perform random walks), but rather to establish a set of cases with gradual shift from simple geometric forms towards randomness, in order to test to which extent the fractal dimension and lacunarity estimates are fit as quantifications of this variation. As indicated above, the synthetic plans were constructed copying the actual values of house count and house sizes as well as (to some extent) the actual measures between parts of the settlements they were meant to imitate (Fig. 5). Image density however, though obviously correlated to these variables, proved more difficult to fix when generating the plans, especially because of edge effects resulting from the voids between the rectangular minimal image frame and the irregular settlement perimeter, but also from houses walking outwards during their random walks and thus expanding the minimal frame and lowering the image density. The density parameter was thus recorded post hoc, and its influence on D and L could only be evaluated through regression analysis on the results (see Results section). In total, a series of 70 images were generated for fractal analysis (Table 1).

Fig. 4
figure 4

Synthetic plan series imitating the archaeological plans of (clockwise from top) Vráble (Fig. 1a), Moshuriv (Fig. 1g), Nebelivka (Fig. 1h) and Úľany (Fig. 1b), each in seven versions with varying levels of added spatial noise. All images were generated in R 4.2.2 using the ggplot and ggforce packages, with 0.5 m resolution. See text for further details

Fig. 5
figure 5

Values of Gini index of house sizes (a), image density (b, defined as fraction of foreground pixels in image) and house count (c) compared between the synthetic settlement plans (blue bars) and their archaeological counterparts (red bars). Gini and house count values were drawn from the archaeological plans and baked into the synthetic ones upon creation (Gini via house-size distribution), while b values were measured post hoc, leading to larger variation within the synthetic series

Table 1 Summary of image series analysed for this paper

The images were analysed for fractal dimension (D) and lacunarity (L) estimates, through the so-called box-counting and gliding box algorithms respectively. In Mandelbrot, (1982), both these values were calculated from mathematically constructed fractals, drawing from scaling characteristics of the patterns’ given initiator and generator (e.g. ibid.: 42–57). Empirical patterns, however, are generally less regular in construction, and these characteristics tend to be far more difficult if not impossible to read from the patterns. Various methods for estimating both D and L from empirical data were therefore developed and explored over the following decade (Allain & Cloitre, 1991; Klinkenberg, 1994). The box-counting method consists of covering the pattern with a set of square grids of increasing tile (i.e. box) size, while counting the number of boxes that intersect with the pattern for each box size. The size range usually lies from the pixel size to the smallest square that can cover the entire pattern, in dyadic steps (i.e. box length is doubled for each successive grid, dividing number of boxes by four; Klinkenberg, 1994: 34–36). D being defined as

$$\frac{\mathrm{log}N}{\mathrm{log}({~}^{1}\!\left/ \!{~}_{r}\right.)}$$

where N is number of elements (or box count) and r is scale (e.g. Mandelbrot, 1982: 36–7; note that Mandelbrot defined r as a fraction of the largest size rather than as a multiple of the smallest, for which D would be defined as \(\mathrm{log}N/\mathrm{log}r\)), is found by plotting the logarithm of the box count to the logarithm of box size, as the slope of the line fitted by linear regression times -1 (or \(\mathrm{log}N\sim \left(-D\right)\mathrm{log}r\)). To my knowledge, the more recent critique and adjustments to this method of fitting power-law models (cf. Clauset et al., 2009) have not yet been properly addressed in the fractal analysis literature. Box-counting in the present analysis was performed this way using the fract2D()-function from the fractD package (Mancuso, 2021). Somewhat similarly, the gliding box algorithm is performed using an array of box sizes, but instead of counting boxes in a grid that overlap with the pattern, here a single box is displaced incrementally across the image for each box size, and a given lacunarity index is calculated for each box size resulting in a distribution. Though there seems to be some variability as to exactly how this lacunarity index is calculated (see. Hingee et al., 2019: 271–2), generally it is an average spread of foreground pixel mass (i.e. black pixel count) for the given box size, given as variance divided by square mean, or

$$\lambda =1+{\left(\frac{\sigma }{\mu }\right)}^{2}$$

(cf. in set notation, Eq. 12 in Hingee et al., 2019; also Eq. 2 in Allain & Cloitre, 1991 or Eq. 2 in Cheng, 1997). Kassel Hingee and colleagues (2019) recently demonstrated that this lacunarity index is mathematically equivalent to spatial covariance, with the advantage that this allows for a far greater tolerance to both missing data and irregular observation windows, both of which being highly pervasive issues in Archaeology. Though these features were not included in the present study — as all images were created as rectangular and only settlement plans with minimal amounts of missing data were analysed — future and hopefully more generalised applications should benefit from them. L estimates were nonetheless calculated here using the gbl()-function from the lacunaritycovariance package written by the same authors (Hingee et al., 2019), and using the “GBLcc.pickaH” estimator (the default). The function only provides the distribution of the lacunarity index over the analysed scales, and not a single summary L value for the entire image, so this was calculated, as with D, as the power-law exponent of the distribution, approximated as the slope of a linear regression on log scales. Note that also here, there is no consensus as to how to summarise the lacunarity distribution across scales in a single value (Hingee et al., 2019: 271–2; Karperien, 2013). Besides the power-law exponent which is mainly used here, other methods include using the arithmetic mean of all index values or the prefactor — rather than the exponent — to the power-law approximation of the distribution. All three methods are notably proposed in the FracLac software manual (Karperien, 2013). These summary statistics for lacunarity — exponent, mean and prefactor — were included in this study in order to elucidate how they correlate with each other and with the other parameters, but most focus was given to the exponent L (hence simply L), primarily for its consistency with the method for calculating fractal dimension, but also since it was seemingly the one used in the above mentioned archaeological study by Farías-Pelayo, (2017), which served as the main impetus for the present study.

It should furthermore be noted that both the box-counting and the gliding box algorithms can be performed on fractal as well as non-fractal and multifractal data (e.g. Plotnick et al., 1996; Klinkenberg, 1994: 29), and the resulting D and L estimates do not by themselves guarantee that the analysed data actually has a fractal structure — which is why they are sometimes referred to as box-counting dimension and gliding box lacunarity. These are nevertheless useful quantifications of spatial patterns, though very little attention has been given to how they describe a pattern of regularly spaced Euclidean objects compared to a self-similar one (possibly since strictly Euclidean patterns are rare in nature). This was the motivation for the simulation part of the study presented below (Section 3.1). In the remainder of the study, the overall goal was to test whether D and L estimates could allow for differentiation of settlement plans based on culture-historical attribution (Section 3.2), or intra-site subdivision in space (quarters/neighbourhoods, Section 3.3) and time (phases, Section 3.4). For these questions, the issue of the pattern being self-similar or not is of secondary relevance.

Results

Fractal dimension (D) and lacunarity (L) estimates of the entire dataset are represented in Fig. 6, showing how all the different image series give internally homogenous results, as well as a general strong correlation between the two variables. See Tab. 2 (online repository, Data and Code Availability statement below) for the complete data table, where these are reported together with house count, image density, house-size distribution (proxied through the Gini coefficient), mean L and prefactor L. Settlement, quarter/neighbourhood and synthetic plans overlap to a large extent, while the Vráble time samples — ignoring sample 01 which represents a single house — all have lower D and higher L levels than the cumulative plan of Vráble as well as the other settlements and quarters, to some degree reflecting overall lower density and house count, but possibly also higher grid irregularity. Maidanetske represents an outlier regarding both high D and low L. The correlation plot (Fig. 7) shows that the L statistic chosen here — the exponent of a power-law approximation to the lacunarity index — has the strongest correlation to D out of the three statistics (Spearman’s coeff. = -0.85), as pointed out by others (see e.g. Equation 12 in Allain & Cloitre, 1991; implicitly in Mandelbrot, 1982: 315–7). Plotnick et al., (1996: 5463) show that this exponent will equal \(D-E\) (Euclidean dimension) for theoretical self-similar monofractals, but that scale-specific effects will cause deviations in empirical cases. The other two L statistics — mean L and prefactor L — are so correlated between each other that they should in practice be interchangeable. Both are less correlated to D (Spearman’s coeff. = -0.57) but in turn very dependent on density (-0.99, they follow power-law functions of density; see online repository for more detail). The potential use of these lacunarity measures for spatial analysis would merit further investigation. Concerning the difference between the time series and the total settlement of Vráble (Fig. 6), the results exemplify the well-known effect that temporal depth can have on how we interpret archaeological data, and show that cumulative settlement plans are not directly comparable to their temporally coeval constituents when it comes to settlement layout. In the following, I discuss the internal results and possible interpretations for each image series.

Fig. 6
figure 6

Fractal dimension (D) and lacunarity (L) estimates of all settlement images (N = 70), calculated using box-counting and gliding-box algorithms respectively. After the correlation between D and L (Spearman’s coeff. = -0.85), house count had the strongest correlation to D (0.82) and image density to L (-0.81)

Fig. 7
figure 7

Correlation table with Spearman’s (rank) correlation coefficients between variables D, L, n, density, Gini and two supplementary summary statistics of L, for all images (N = 70). All correlations except \(Gini \sim [density, {L}_{mean}and {L}_{prefactor}]\) were significant with p < 0.05

Synthetic Data

Some general observations can be made from the results for the synthetic settlement plans (Fig. 8). Though the D and L estimates of the archaeological plans do not fit perfectly within the spectre of results from their synthetic counterparts, they are all grouped at roughly the same parts of the plot. Furthermore, within each series of synthetic settlement plans, D and L estimates are roughly correlated to noise level (defined by step count in random walk). When analysed separately from the total dataset, D is here — as the only one of the included parameters — significantly correlated to noise (Spearman’s coeff. = -0.49 with p = 0.01, N = 28), indicating that it may be possible to model noise on archaeological settlement plans using the fractal dimension. Such modelling would however be premature at the present stage due to the small sample size and the grouped nature of the data. The correlations of noise ~ L (Spearman’s coeff. = 0.33, p = 0.08), and noise ~ density (Spearman’s coeff. = -0.36, p = 0.06), are weaker and not entirely significant, meaning that they would perform poorly, at least on their own, as proxies for noise.

Fig. 8
figure 8

D and L estimates for synthetic settlement plans (coloured circles, N = 28) and their archaeological counterparts (black dots, N = 4). For each synthetic group, the smallest circle corresponds to an ideal geometric settlement plan, and larger circles indicate more added spatial noise. Noise step increments were exponential from 1 to 32. The line paths within each group are added for visibility

Comparing the plot of D and L of the synthetic plans and their archaeological counterparts (Fig. 8) with the settlement plans themselves (Figs. 1 and 4), we see that as the houses are spread more randomly across the image, the fractal dimension decreases and lacunarity increases, and the weaker correlation to density indicates that this cannot be explained simply by the slight increase in image size as a few houses “wander” outwards. Random spatial noise is thus less “space filling” than all the simple geometric shapes that form the ideal settlement plans (orthogonal grids and concentric circles and ovals). Furthermore, despite differences in size and layout, the Linear Pottery settlement of Vráble (313 houses) and the Trypillia settlements of Moshuriv (84 houses) and Nebelivka (1494 houses) have similar relative distributions between built space and voids, while at Úľany (Linear Pottery, 34 houses) the empty space in the settlement is relatively bigger.

Complete Settlement Plans

When D and L estimates are compared between complete settlement plans, an important question is whether or not these can be used to distinguish between settlements of different culture-historical units. Figure 9 shows that the Trypillia sites have consistently higher D values, though with minimal margin, than the Linear Pottery sites. L values are — with the exception of Talne 3 — lower for the Trypillia settlements than the Linear Pottery settlements. This indicates — maybe somewhat counter-intuitively — that the plans of these latter ones have more variation in their layouts and open spaces and that they are less space filling. As an example, Čierne — the Linear Pottery settlement with the lowest lacunarity estimate, is the one with the lowest degree of clustering of houses into separate groups (Fig. 1c). If we extrapolate the finding from Fig. 8 — that noise is inversely correlated to D — we could interpret the settlements as ranging from noisy or irregular small Linear Pottery settlements like Úľany and Čifare towards more geometrically regular Trypillia mega-sites (Figs. 1 and 9). It has been noted in the consecrated literature that the site plan of Maidanetske is somewhat “overgrown” compared to other Trypillia mega-sites like Nebelivka (e.g. Ohlrau, 2020: 284). René Ohlrau argues that this is due to the relatively longer occupation span of the site, where towards the end there was little space left for new house construction without building on top of previously abandoned and burnt houses, thus forcing ahead more chaotic deviations from the overall regularities in the plan. Inversely, Johannes Müller and colleagues show to the relatively more rapid decline in house construction in the inner quarters of Nebelivka, leaving them more sparsely occupied than those at Maidanetske (Müller et al., 2022). It is possible that this aspect of the site plans of Maidanetske and Nebelivka is what causes them to differ in their fractal dimension estimates. This, however, seemingly contradicts the above-mentioned negative correlation between noise and fractal dimension, since Maidanetske then should be expected to be more “noisy” and thus have lower D than Nebelivka. On the other hand, the tightly packed streed grids at Maidanetske are undoubtedly more “space filling” (so higher D) than the more loose and open layout at Nebelivka, and the open central space of this latter one is larger (relatively speaking) than the one at Maidanetske, causing a more uneven distribution of gaps between houses (i.e. higher L, Fig. 1: h-i). Only larger and more systematic studies can allow for more elaboration on this point. In any case, it seems clear that the temporal depth of settlements should be a parameter to account for when analysing total settlement plans, as differing temporal depths in otherwise similar settlements can have significant effects on the appearance of their plans, as also seen in the differing results between the total plan of Vráble and those of its various temporal samples (see below).

Fig. 9
figure 9

D and L estimates for archaeological settlement plans from the Linear Pottery (LBK, red) and Trypillia cultures (blue). For this subset, no other variables than L were significantly correlated to D, but house count was correlated to L (Spearman’s coeff. = -0.77, p = 0.02, N = 9)

Spatial Subdivisions

The interest of comparing D and L values of different sections within large settlements lies in the potential for identifying quarters that have different settlement textures, possibly reflecting differences in social and/or economic organisation. This approach is increasingly being applied in Urban Science on contemporary cities, for the study of economic and functional differentiation between e.g. suburbs and city centres (Jahanmiri & Parker, 2022).

The 14 quarters of Nebelivka and the three neighbourhoods of Vráble were analysed for D and L estimates (Figs. 1a, h and 10). The spread of estimates was larger within both groups than between them, preventing cultural identification of the quarters based on these parameters alone, but rather indicating that neighbourhoodwise, the relationship between built and unbuilt space was similar in both settlements. The total site plan of Vráble lies centrally between the three neighbourhoods, and close to the South-West neighbourhood, while Nebelivka’s total estimates are towards the lower right end of the plot, with D values probably drawn up and L values down by its high house count and Gini (cf. Figs. 5 and 7). The Nebelivka quarter values actually seem to follow a spatial South-East — North-West gradient, with the high D and low L values of quarters D, F and G in the north-west end (and L in the south-east end, Fig. 1h) reflecting the dominance of grid-like house rows inside the inner main circle, while the low D and high L values of quarters A, N, M, J, K and I reflect relatively few inner houses and thus dominance of the (mostly empty) main house circles, the remaining quarters B, E, H and C being somewhere in-between. This tendency is also reflected in the correlation with both D and L for this subset with image density (Fig. 10). Another possibility that cannot yet be excluded is that this apparently spatial gradient could be only a statistical artefact due to grid orientation and the radial structure of the settlement. In the case of the Vráble neighbourhoods, it does seem that both fractal dimension and lacunarity capture some difference in internal openness between the South-East neighbourhood and the two others. Visually (Fig. 1a) it would seem that both the North and South-West neighbourhoods have more closed layouts than the South-East, but the South-West neighbourhood could possibly be drawn artificially towards the upper left of the plot (Fig. 10) because of edge effects due to its more irregular outline (i.e. including relatively more empty space in the corners of the rectangular image).

Fig. 10
figure 10

D and L estimates for separate Vráble neighbourhoods (blue) and Nebelivka quarters (red). For this subset, apart from their correlation to each other, both D and L were most strongly correlated to image density (Spearman’s coeff. = 0.72 and -0.83 respectively, both with p < 0.01, N = 17)

Temporal Subdivisions

In the present study sample, fine-grained temporal development within a large settlement could only be observed at Vráble, where house-orientation has been shown to correlate with construction date (Müller-Scheeßel et al., 2020). Given the uncertainties related to this method (see above), some observations can still be made concerning the temporal dynamics of D and L within the Vráble settlement (Fig. 11).

Fig. 11
figure 11

D and L estimates for the settlement plan of Vráble at time samples of 20 years intervals from 1 (ca. 5290 cal. BCE) to 16 (ca. 4990 cal. BCE). Image density and house count were closely correlated to D and L. Density to D and L: Spearman’s coeff. = 0.52 and -0.98 with p = 0.04 and < 0.01 respectively. House count to D and L: Spearman’s coeff. = 0.94 and -0.54 with p < 0.01 and = 0.03 respectively. Sample 1 had very deviant values and was left out (see Fig. 6). The path between points is added for readability

The temporal path of D and L of the Vráble time samples starts from the top-left and stretches towards the bottom-right in its middle (sample 9, ca. 5130 cal. BCE), before turning back again towards its point of origin. The variation in L is somewhat wider than the variation in D. Over its ca. 300 years development from ca. 5290 to 4950 cal. BCE, the size of Vráble, as measured by house count, follows a classic “battleship” curve, which in the D-L plot is well reflected by the larger samples being situated towards the bottom right of the plot — in this case, house count and density are closely related variables since in most time samples new houses appear within the already established settlement perimeter, and don’t contribute to a much larger image frame. More specifically, and ignoring the atypical samples 1, 2 and 16 that only include one or two of the three neighbourhoods, both the D and L values increase steadily over time from sample 4 (ca. 5230 cal. BCE), to a stable point at s.7 (ca. 5170), where they remain relatively unchanged until s.10 (ca. 5110) from where they decrease to another plateau at s.11–13 (ca. 5090–5050), before decreasing rapidly until s.15 (ca. 5010).

Pinpointing exactly what it is in the sampled settlement plans that follows this trajectory is, again, not obvious. It seems that the estimates are strongly linked to house count and density — as the settlement grows, lacunarity decreases and fractal dimension increases, the plan becomes more regularly “filled”, and inversely after population peak. Regarding variation in Gini index, there is little variation over time, and it seems more randomly distributed rather than correlated to other variables (see online repository, Tab. 2). If the observation on synthetic plans is general, that lower D values reflect more noise from ideal geometric plans, it could be argued that the variations in D over time at Vráble refer to variation in the visibility of the grid-like settlement layout. In fact, the early and late time samples of Vráble do seem more chaotic, while the middle time samples (roughly 7–10, or between 5170 and 5110 cal. BCE following the temporal model), house placement at construction seems to have been done increasingly with respect to already existing houses (Fig. 2). Again, this interpretation depends on the accuracy of the house orientation as temporal proxy (Müller-Scheeßel et al., 2020) as well as the modelled duration of houses (Meadows et al., 2019), the uncertainty of both being difficult to quantify. A larger simulation-based study could potentially allow for more elaboration on this point but would also be computationally far more demanding.

Note that the estimated D values for the early and late time samples 1–6 and 11–16 fall below the expected lower threshold of 1, even though they are measured on a two-dimensional plane. This can be thought of as a result of the pattern being so thinly scattered in the plane, that if all the foreground pixels were aligned in a single straight line, this would have to be interrupted into segments — like the Cantor set — for it to provide such small box-counts at the smaller scales. Mandelbrot termed such scattered spatial patterns spatial Lévy dust (e.g. Mandelbrot, 1982: 292).

Discussion

The results presented above show that the interpretation of fractal dimension and lacunarity estimates on settlement plan layouts is not a straight-forward matter. Both measures are correlated with, but not reduced to, other variables such as element count, image density, and size distribution between elements. The experimental part of this study shows that neither fractal dimension nor lacunarity estimates can be fully simulated through only these parameters, and they represent thus spatial realities which are not quantified through the other applied parameters. Furthermore, the simulation study indicates that fractal dimension does covary significantly with levels of random noise, although at the present stage there is insufficient support for extrapolating this correlation onto archaeological settlement plans. However, with these results, the possibility of modelling noise levels on settlement plans should not be disregarded, as it seems like a realistic goal given larger amounts of data. The subsequent parts of this study show that for archaeological settlement plans, whether they are analysed as they are or subdivided into spatially or temporally coherent parts, in many cases can be interpreted in terms of compactness of layouts and variability of internal open spaces through estimates of fractal dimension and lacunarity. Further study could give clearer appreciation of unwanted biases such as edge effects, grid rotation and missing data.

Interpretations of fractal dimension levels on archaeological settlement plans have until now been limited by the near absence of comparative material in the published literature. The perhaps most elaborate archaeological paper in this direction so far was written by Clifford Brown and Walter Witschey about 20 years ago (Brown & Witschey, 2003). Here they estimated the fractal dimensions of two ancient Maya settlements, including also intra-site variations, through box-counting. While the analysis itself was thorough and well argued, they interpreted a multi-level social structure from the results, simply from the fact that the obtained D values were fractions and not integers, indicative of a settlement structure consisting of “clusters of clusters of clusters” (ibid. pp. 1625–8). While their conclusions might well be correct (judging from other strains of evidence), they are not supported by the results of their fractal analysis, since any pattern — self-similar or not — that does not fill entirely its embedding Euclidean dimension, will be expected to have a fractional dimension when estimated through box-counting. In this study, I have shown how very different-looking settlement plans may result in nearly identical fractal dimension values (Fig. 9). While adding lacunarity to the analysis does improve the quantitative characterisation of irregular patterns, the two parameters did not suffice alone to convincingly distinguish between Trypillia and Linear Pottery settlement plans. As another example of somewhat premature interpretations of fractal dimension measures of settlement plans, Klaudia Oleschko and colleagues, in their innovative study of the Teotihuacán plan, highlighted the similarity of the D value of the Ciudadela quarter and the theoretical fractal known as the Sierpinski Carpet at D ≈ 1.89 (Oleschko et al., 2000). Again, a single value of fractal dimension can result from very different patterns, and the result is less convincing due to the applied methodology: the researchers had used aerial and satellite photographs in the analysis, rather than simplified binary maps, in such a way that it is likely that the result was heavily influenced by other factors such as vegetation, roads and paths as well as shadows. In the present study, I have opted for interpreted plans with buildings rendered as black filled polygons, while Brown and Witschey, (2003) opted for empty building outlines only. Both approaches have arguments for and against, but the influence on estimated D and L values remain for the time being unknown. Finally, yet an unfounded assumption regarding fractal dimension values is that since they represent a measure of geometric complexity in a certain sense, they are expected to increase over time. Castrejón-Pita et al., (2003) proposed the use of fractal dimension as a proxy for relative time in order to construct a chronology of the notoriously difficult-to-date Nasca geoglyphs in Peru, suggesting that the making of low-dimensional images of parallel lines and simple geometric forms would have been a technical prerequisite to later creations of more complex and naturalistic imagery. However intuitive this idea may seem, counter-examples abound throughout History and Prehistory. Without the examples being otherwise comparable, the case presented above of the temporal development of the Neolithic settlement plan of Vráble does at least demonstrate that dimensional values of human constructed features can both increase and decrease over time.

The question of spatial variation in D and L measures within patterns seems particularly promising, in that it may serve to identify quarters or neighbourhoods with differing socio-economic functions. This methodology may however be more fitting for later periods where there are clearer signs of spatial differentiation within societies, e.g. between administrative, sacred, residential and artisanal or industrial areas. On the other hand, systematic use of fractal analysis precisely within contexts where such spatial differentiation is debated — like the European Neolithic — may in the end prove more useful. Application of this methodology on later contexts, e.g. Roman urban contexts, could serve to demonstrate more unambiguously that the method actually does what it promises. Mapping of varying fractal dimension and lacunarity values across an image can be done by dividing the image into a grid, with arbitrary tile sizes depending on desired resolution, and perform the box-counting and gliding box algorithms within each tile. The investigation of variations in dimension across scale, rather than space, within a pattern, is called multifractal analysis, and has to my knowledge never been attempted in Archaeology (see Karperien, 2013, for an introduction to the open-source software plugin FracLac for ImageJ, that can be used for this purpose). These approaches could potentially become useful in image recognition and computer vision applications on remote sensing imagery. There is an ongoing revolution in the domain of automated detection of archaeological sites and structures (e.g. Guyot et al., 2021; Olivier & Vaart, 2021), and the inclusion of fractal analysis methods to the applied under-the-hood machine learning processes could possibly allow for the detection and automated interpretation of more complex archaeological remains like architectural ensembles. A significant challenge here, however, is the associated computational cost of the analyses.

The differing results between the temporally coeval plans of Vráble and the total (temporally accumulated) settlement plans show that fractal dimension values of such data sets are not directly comparable with each other. It is — as always with intra-site spatial analyses — vital to always recognise the temporal depth and resolution of the data set one is analysing, and thus to construct one’s research questions accordingly (Perreault, 2019). I would argue that questions of spatial complexity and noise relative to scales of social organisation are better investigated through data sets consisting of coeval or near coeval settlement plans — which can admittedly be hard to produce in many archaeological settings — while total plans of settlements could potentially serve for image recognition purposes and automated culture-historical interpretation. As suggested by the above results (Fig. 9), differences in fractal dimension may even be indicative — all other things being equal — of temporal depth itself: two characteristics that distinguish Maidanetske from Nebelivka being that the former has a site-plan that has been described as “overgrown”, while also being a site with a longer occupation span than the latter (Ohlrau, 2020). This is also a possibility that merits further investigation.

Finally, the perhaps major obstacle to a more widespread use of fractal analysis among archaeologists (as well as other humanists and social scientists) has until now been the overly specialised and somewhat opaque nature of the methodology. As for now, box-counting and gliding-box algorithms are not implemented in standard GIS software, and more specialised software or coding is necessary to perform the analysis. In a detailed assessment of different available softwares using archaeological and synthetic data, Sabrina Farías-Pelayo reported significant discrepancies in the obtained results on identical images, caused mainly by differing algorithms used to convert images into binary format (Farías-Pelayo, 2015: 124–140). It is worth noting that the available softwares are mostly developed in research labs and oriented towards usage within specific disciplines, and that software for more general use is still largely non-existent. The above mentioned FracLac plugin remains perhaps the most commonly used, due to its relative user-friendliness and the wide array of proposed analyses (Karperien, 2013). The analyses presented in the present paper have been performed through a custom-made script in the open source programming language R (see online repository), hopefully enhancing reproducibility of the results, as well as encouraging further development. However, also when opting for direct coding, several issues persist. The two packages used here for box-counting and gliding box functions — fractD (Mancuso, 2021) and lacunaritycovariance (Hingee et al., 2019) respectively — are recently and independently developed, and are imperfectly compatible with each other. Further streamlining these, and adding functionalities like local and multifractal scans, would certainly increase their usefulness and applicability.

Conclusion

In the present study, fractal dimension and lacunarity estimates were calculated on a series of 70 binary images of archaeological and synthetic settlement plans, allowing for an in-depth empirical exploration of how these parameters relate to factors such as settlement layout, size, density and house-size distribution, as well as the more elusive factor of noise or irregularity from simple geometric patterns. A significant negative correlation between fractal dimension and noise was observed, indicating that it could potentially serve as a quantitative proxy for this in comparative studies. Furthermore, in comparisons between archaeological settlement plans of various layouts, fractal dimension seemed to correspond well to the compactness of buildings and the internal coherence of the geometric layout, while lacunarity described the variability of gaps between buildings and groups of buildings. This was illustrated by Trypillia settlements having generally higher fractal dimension and lower lacunarity than Linear Pottery settlements, but also with relatively large variation within each culture group. For the Trypillia mega-site of Nebelivka, the fractal analysis allowed for a quantitative differentiation of internal quarters between those that were dominated by internal streets and those dominated by the main circular street, and for the Linear Pottery settlement of Vráble, fractal analysis indicated a process of filling and emptying of the settlement’s internal layout over time. While much remains before the nature of these factors are fully understood, it seems clear that fractal analysis methods are well suited for investigations on the variability of textures in archaeological settlement plans. It is furthermore argued here that settlement textures can be indicative of a range of underlying socio-cultural factors, like social hierarchies, settlement planning, household autonomy and economic function, in addition to more primary chrono-cultural characteristics such as culture groups, settlement types and duration, illustrating the importance and wide potential of these methods. Fractal analysis in the wider sense — including methods that are not explored in this paper — holds even more possibilities that need to be further explored by archaeologists (Diachenko, 2018). This insight has now been recognised by specialists for several decades (Zubrow, 1985), but maybe the current democratisation of open-source programming (Schmidt & Marwick, 2020) is what is needed for the fractalist framework to be further integrated into archaeological method and theory.