Search Results (1,107)

Search Parameters:
Journal = Data

18 pages, 4309 KiB  
Article
Observational Monitoring Records Downstream Impacts of Beaver Dams on Water Quality and Quantity in Temperate Mixed-Land-Use Watersheds
by Erin E. Novobilsky, Jack R. Navin, Deon H. Knights and P. Zion Klos
Data 2025, 10(4), 51; https://doi.org/10.3390/data10040051 (registering DOI) - 7 Apr 2025
Abstract
Beaver populations in the U.S. Northeast are rising, increasing the number of beaver dams and ponds in suburban watersheds. These new beaver ponds may affect how harmful algal blooms occur by changing biogeochemical cycling and sediment characteristics. In this study, piezometers installed upstream and downstream of multiple dam structures were used to evaluate changes in nitrate and orthophosphate concentrations in surface and hyporheic water. Data were also collected with seepage meters, discharge measurements, lab and field-based analytical tests, and sediment samples. These were collected from beaver dams and paired non-beaver dams upstream of unimpounded reaches to assess the potential for dormant sediment-based cyanobacteria to bloom and produce toxins under ideal light and nutrient levels. Results indicate a significant increase in orthophosphate from upstream to downstream of beaver dams. Results also demonstrate that toxin potential did not increase between cyanobacteria in beaver pond sediment and the paired unimpounded sample; however, under ideal light and nutrient levels, sediment from a beaver dam led to faster cyanobacterial growth. These findings highlight that while beaver dams and impoundments function as nutrient sinks within the tributary watersheds, there are potential risks from downstream transport of bloom-inducing sediment following a dam collapse.

13 pages, 633 KiB  
Article
Sentiment Matters for Cryptocurrencies: Evidence from Tweets
by Radu Lupu and Paul Cristian Donoiu
Data 2025, 10(4), 50; https://doi.org/10.3390/data10040050 - 1 Apr 2025
Viewed by 160
Abstract
This study provides empirical evidence that cryptocurrency market movements are influenced by sentiment extracted from social media. Using a high frequency dataset covering four major cryptocurrencies (Bitcoin, Ether, Litecoin, and Ripple) from October 2017 to September 2021, we apply state-of-the-art natural language processing techniques on tweets from influential Twitter accounts. We classify sentiment into positive, negative, and neutral categories and analyze its effects on log returns, liquidity, and price jumps by examining market reactions around tweet occurrences. Our findings show that tweets significantly impact trading volume and liquidity: neutral sentiment tweets enhance liquidity consistently, negative sentiments prompt immediate volatility spikes, and positive sentiments exert a delayed yet lasting influence on the market. This highlights the critical role of social media sentiment in influencing intraday market dynamics and extends the research on sentiment-driven market efficiency. Full article

13 pages, 219 KiB  
Data Descriptor
Predictors of Immune Fitness and the Alcohol Hangover: Survey Data from UK and Irish Adults
by Joris C. Verster, Agnese Merlo, Maureen N. Zijlstra, Benthe R. C. van der Weij, Anne S. Boogaard, Sanne E. Schulz, Jessica Balikji, Andy J. Kim, Sherry H. Stewart, Simon B. Sherry, Johan Garssen, Gillian Bruce and Lydia E. Devenney
Data 2025, 10(4), 49; https://doi.org/10.3390/data10040049 - 1 Apr 2025
Viewed by 65
Abstract
Immune fitness is defined as the capacity of the body to respond to health challenges (such as infections) by activating an appropriate immune response to promote health and prevent and resolve disease, which is essential for improving quality of life. Thus, immune fitness plays an essential role in health, and reduced immune fitness may be an important signal of increased susceptibility for disease. Lifestyle factors such as increased levels of alcohol consumption have been shown to negatively impact immune fitness. The alcohol hangover is the most frequently reported negative consequence of alcohol consumption and is defined as the combination of negative mental and physical symptoms, which can be experienced after a single episode of alcohol consumption, starting when blood alcohol concentration (BAC) approaches zero. Significant correlations have been reported between hangover severity and both immune fitness and biomarkers of systemic inflammation. The concepts of immune fitness and alcohol hangover are further linked by the fact that the inflammatory response to alcohol consumption plays an important role in the pathology of the alcohol hangover. Moreover, immune fitness has been related to the susceptibility of experiencing hangovers per se. It is therefore important to investigate the interrelationship between immune fitness and the alcohol hangover, and to identify possible predictor variables of both constructs. This data descriptor article describes a study that was conducted with adults living in the UK or Ireland, evaluating possible correlates and predictors of immune fitness and the alcohol hangover. Data on mood, personality, mental resilience, pain catastrophizing, and sleep were collected from n = 1178 participants through an online survey. Herein, the survey and corresponding dataset are described. Full article
5 pages, 638 KiB  
Data Descriptor
Plankton Dataset During Austral Spring and Summer in the Valdés Biosphere Reserve, Patagonia, Argentina
by Ariadna Celina Nocera, Maité Latorre, Valeria Carina D’Agostino, Brenda Temperoni, Carla Derisio, María Sofía Dutto, Anabela Berasategui, Irene Ruth Schloss and Rodrigo Javier Gonçalves
Data 2025, 10(4), 48; https://doi.org/10.3390/data10040048 (registering DOI) - 31 Mar 2025
Viewed by 45
Abstract
The present dataset served to evaluate the plankton community composition and abundance in Nuevo Gulf (42°42′ S, 64°30′ W), a World Heritage Site in Argentinian Patagonia and part of the Valdés Biosphere Reserve. It reports zooplankton abundance (>300 µm) and phytoplankton concentration (10–200 μm) during the spring and summer seasons from 2019 to 2021. Special attention was given to the taxonomic classification of zooplankton, leading to the first identification of jellyfish species within the Gulf and the detection of an unreported copepod for the area (Drepanopus forcipatus). Samples were collected at two depths—a surface and a deeper layer—to assess vertical distribution patterns of plankton communities and explore potential environmental drivers influencing their variability. This dataset provides a valuable baseline for future studies analyzing temporal variations in the Gulf’s plankton communities. Moreover, it encourages the local scientific community to contribute data and promote open access to marine biodiversity records in the region. Full article

11 pages, 421 KiB  
Data Descriptor
A Comprehensive Monte Carlo-Simulated Dataset of WAXD Patterns of Wood Cellulose Microfibrils
by Ricardo Baettig and Ben Ingram
Data 2025, 10(4), 47; https://doi.org/10.3390/data10040047 - 29 Mar 2025
Viewed by 112
Abstract
Wide-angle X-ray diffraction analysis is a powerful tool for investigating the structure and orientation of cellulose microfibrils in plant cell walls, but the complex relationship between diffraction patterns and underlying structural parameters remains challenging to both understand and validate. This study presents a comprehensive dataset of 81,906 Monte Carlo-simulated wide-angle X-ray diffraction patterns for the cellulose Iβ 200 lattice. The dataset was generated using a mechanistic, physically informed simulation procedure that incorporates realistic cell wall geometries from wood anatomy, including circular and polygonal fibers, and accounts for the full range of crystallographic and anatomical parameters influencing diffraction patterns. Each simulated pattern required multiple nested Monte Carlo iterations, totaling approximately 10 million calculations per pattern. The resulting dataset pairs each diffraction pattern with its exact generating parameter set, including mean microfibril angle (MFA), MFA variability, fiber tilt angles, and cell wall cross-sectional shape. The dataset addresses a significant barrier in the field: the lack of validated reference data with known ground truth values for testing and developing new analytical methods. It enables the development, validation, and benchmarking of novel algorithms and machine learning models for MFA prediction from diffraction patterns. The simulated data also allow for systematic investigation of the effects of geometric factors on diffraction patterns and serve as an educational resource for visualizing structure–diffraction relationships. Despite some limitations, such as assuming ideal diffraction conditions and focusing primarily on the S2 cell wall layer, this dataset provides a valuable foundation for advancing X-ray diffraction analysis methods for cellulose microfibril architecture characterization in plant cell walls.

13 pages, 4706 KiB  
Data Descriptor
River Restoration Units: Riverscape Units for European Freshwater Ecosystem Management
by Gonçalo Duarte, Angeliki Peponi, Pedro Segurado, Tamara Leite, Florian Borgwardt, Andrea Funk, Sebastian Birk, Maria Teresa Ferreira and Paulo Branco
Data 2025, 10(4), 46; https://doi.org/10.3390/data10040046 - 28 Mar 2025
Viewed by 117
Abstract
Freshwater habitats and biota are among the most threatened worldwide. In Europe, significant efforts are being taken to counteract detrimental human impacts on nature. In line with these efforts, the MERLIN project funded by the H2020 program focuses on mainstreaming ecosystem restoration for freshwater-related environments at the landscape scale. Additionally, the Dammed Fish project focuses on one of the main threats affecting European Networks—artificial fragmentation of the river. Meeting the objectives of both projects to work on a large, pan-European scale, we developed a novel spatial database for river units. These spatial units, named River Restoration Units (R2Us), abide by river network functioning while creating the possibility of aggregating multiple data sources with varying resolutions to size-wise comparable units. To create the R2U, we set a methodological framework that departs from the Catchment Characterization and Modelling—River and Catchment Database v2.1 (CCM2)—together with the capabilities of the River Network Toolkit (v2) software (RivTool) to implement a seven-step methodological procedure. This enabled the creation of 11,557 R2U units in European sea outlet river basins along with their attributes. Procedure outputs were associated with spatial layers and then reorganized to create a relational database with normalized data. Under the MERLIN project, R2Us have been used as the spatial analysis unit for a large-scale analysis using multiple input datasets (e.g., ecosystem services, climate, and European Directive reporting data). This database will be valuable for river management and conservation planning, being particularly well suited for large-scale restoration planning in accordance with European Nature legislation. Full article
(This article belongs to the Topic Intersection Between Macroecology and Data Science)

23 pages, 8564 KiB  
Article
A Benchmark Dataset for the Validation of Phase-Based Motion Magnification-Based Experimental Modal Analysis
by Pierpaolo Dragonetti, Marco Civera, Gaetano Miraglia and Rosario Ceravolo
Data 2025, 10(4), 45; https://doi.org/10.3390/data10040045 - 27 Mar 2025
Viewed by 102
Abstract
In recent years, the development of computer vision technology has led to significant implementations of non-contact structural identification. This study investigates the performance offered by the Phase-Based Motion Magnification (PBMM) algorithm, which employs video acquisitions to estimate the displacements of target pixels and amplify vibrations occurring within a desired frequency band. Using low-cost acquisition setups, this technique can potentially replace the pointwise measurements provided by traditional contact sensors. The main novelty of this experimental research is the validation of PBMM-based experimental modal analyses on multi-storey frame structures with different stiffnesses, considering six structural layouts with different configurations of diagonal bracings. The PBMM results, both in terms of time series and identified modal parameters, are validated against benchmarks provided by an array of physically attached accelerometers. In addition, the influence of pixel intensity on the accuracy of the estimates is investigated. Although the PBMM method shows limitations due to the low frame rates of the commercial cameras employed, along with an increase in the signal-to-noise ratio at the bracing nodes, this method proved effective in modal identification for structures with modest variations in stiffness along their height. Moreover, the algorithm exhibits modest sensitivity to pixel intensity. An open access dataset containing video and sensor data recorded during the experiments is available to support further research at https://doi.org/10.5281/zenodo.10412857.

8 pages, 3288 KiB  
Data Descriptor
Experimental Dataset for Fiber Optic Specklegram Sensing Under Thermal Conditions and Use in a Deep Learning Interrogation Scheme
by Francisco J. Vélez, Juan D. Arango, Víctor H. Aristizábal, Carlos Trujillo and Jorge A. Herrera-Ramírez
Data 2025, 10(4), 44; https://doi.org/10.3390/data10040044 - 26 Mar 2025
Viewed by 84
Abstract
This dataset comprises specklegram images acquired from a multimode optical fiber subjected to varying thermal conditions. Designed for training neural networks focused on developing Fiber Optic Specklegram Sensors (FSSs), these experimental data enable the detection of changes in speckle patterns corresponding to applied temperature variations. The dataset includes 24,528 images captured over a temperature range from 25 °C to 200 °C, with incremental steps of approximately 0.175 °C. Key acquisition parameters include a wavelength of 633 nm, a sensing zone length of 20 mm, and a multimode fiber with a core diameter of 62.5 μm. This dataset supports developing and validating temperature-sensing models using fiber optic technology and can facilitate benchmarking against other experimental or synthetic datasets. Finally, an implementation is presented for utilizing the dataset in a deep learning interrogation scheme. Full article

22 pages, 2388 KiB  
Article
Improved Script Identification Algorithm Using Unicode-Based Regular Expression Matching Strategy
by Mamtimin Qasim and Wushour Silamu
Data 2025, 10(4), 43; https://doi.org/10.3390/data10040043 - 25 Mar 2025
Viewed by 154
Abstract
While script identification is the first step in many natural language processing and text mining tasks, at present, there is no open-source script identification algorithm for text. For this reason, we analyze the Unicode encoding of each type of script and construct regular expressions in this study, in order to design an improved script identification algorithm. Because some scripts share common characters, it’s impossible to count and summarize them. As a result, some extracted scripts are incomplete, which affects subsequent text processing tasks; furthermore, if a new script identification feature is required, the regular expression for each script must be re-adjusted. To improve the performance and scalability of script identification, we analyze the encoding range of each script provided on the official Unicode website and identify the shared characters, allowing us to design an improved script identification algorithm. Using this approach, we can fully consider all 169 Unicode script types. The proposed method is scalable and does not require numbers, punctuation marks, or other symbols to be filtered during script identification; furthermore, these items in the text are also included in the script identification results, thus ensuring the integrity of the provided information. The experimental results show that the proposed algorithm performs almost as well as our previous script identification algorithm while providing improvements on its basis. Full article
(This article belongs to the Section Information Systems and Data Management)
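The general idea of matching text against per-script Unicode ranges can be sketched as follows. This is a minimal illustration only, not the authors' algorithm: the `SCRIPT_RANGES` table and the `identify_scripts` helper are hypothetical, cover just four scripts, and list only a few basic blocks per script, so shared characters such as digits and punctuation are not handled.

```python
import re

# Hypothetical, heavily abbreviated table: each entry lists only a few basic
# Unicode blocks for the script. The full Unicode script property spans many
# more ranges, and this sketch makes no attempt to handle shared characters.
SCRIPT_RANGES = {
    "Latin": r"A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F",
    "Cyrillic": r"\u0400-\u04FF",
    "Arabic": r"\u0600-\u06FF",
    "Han": r"\u4E00-\u9FFF",
}

# One compiled character-class pattern per script; "+" groups adjacent
# characters of the same script into a single run.
SCRIPT_PATTERNS = {
    name: re.compile(f"[{ranges}]+") for name, ranges in SCRIPT_RANGES.items()
}


def identify_scripts(text):
    """Return a dict mapping each detected script to its matched runs."""
    found = {}
    for name, pattern in SCRIPT_PATTERNS.items():
        runs = pattern.findall(text)
        if runs:
            found[name] = runs
    return found


if __name__ == "__main__":
    print(identify_scripts("Hello мир 世界"))
```

Supporting a further script in this sketch means adding one entry to the table; the existing patterns stay unchanged.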

12 pages, 2320 KiB  
Data Descriptor
Linking Fungal Genomics to Thermal Growth Limits: A Dataset of 730 Sequenced Species
by William Bains
Data 2025, 10(4), 42; https://doi.org/10.3390/data10040042 - 25 Mar 2025
Viewed by 179
Abstract
The response of fungal species to changes in temperature is of theoretical and practical importance in a world of changing temperatures, ecologies and populations. Genomic sequencing to identify fungal species and their potential metabolic capabilities is well established, but linking this to growth temperature conditions has been limited. To that end, I describe a dataset that brings together the maximum and minimum temperature growth limits for 730 species of Fungi and Oomycetes for which genome sequences are available, together with supporting proteome and taxonomic data and literature references. The set will provide an entry for studies into how genomic structure and sequence can be used to predict the potential for growth at low or high temperatures, and hence the potential industrial use or pathogenic liability of existing or new fungal species. Full article

5 pages, 1513 KiB  
Data Descriptor
Terrestrial Carbon Storage Estimation in Guangdong Province (2000–2021)
by Wei Wang, Yueming Hu, Xiaoyun Mao, Ying Zhang, Liangbo Tang and Junxing Cai
Data 2025, 10(4), 41; https://doi.org/10.3390/data10040041 - 25 Mar 2025
Viewed by 155
Abstract
(1) Terrestrial ecosystems are critical carbon sinks, and the accurate assessment of their carbon storage is vital for understanding global carbon cycles and formulating climate change mitigation strategies. (2) This study integrated vegetation indices, meteorological factors, land use data, soil/vegetation types, field sampling, and a convolutional neural network (CNN) model to estimate the carbon storage of terrestrial ecosystems in Guangdong Province. (3) Total carbon storage increased by 0.11 Pg from 2000 to 2021, with vegetation carbon gains (+0.19 Pg) offsetting soil carbon losses (−0.08 Pg), with the latter primarily being driven by reduced soil carbon in forest ecosystems. (4) Northern and eastern Guangdong exhibit high potential for enhancing carbon storage capacity, which is crucial for achieving regional carbon peaking and neutrality targets. Full article
(This article belongs to the Section Spatial Data Science and Digital Earth)

10 pages, 2466 KiB  
Data Descriptor
Analysis of Minerals Using Handheld Laser-Induced Breakdown Spectroscopy Technology
by Naila Mezoued, Cécile Fabre, Jean Cauzid, YongHwi Kim and Marjolène Jatteau
Data 2025, 10(3), 40; https://doi.org/10.3390/data10030040 - 20 Mar 2025
Viewed by 198
Abstract
Laser-induced breakdown spectroscopy (LIBS), a rapid and versatile analytical technique, is becoming increasingly widespread within the geoscience community. Suitable for fieldwork analyses using handheld analyzers, the elemental composition of a sample is revealed by generating plasma using a high-energy laser, providing a practical solution to numerous geological challenges, including identifying and discriminating between different mineral phases. This data paper presents over 12,000 reference mineral spectra acquired using a handheld LIBS analyzer (© SciAps), including those of silicates (e.g., beryl, quartz, micas, spodumene, vesuvianite, etc.), carbonates (e.g., dolomite, magnesite, aragonite), phosphates (e.g., amblygonite, apatite, topaz), oxides (e.g., hematite, magnetite, rutile, chromite, wolframite), sulfates (e.g., baryte, gypsum), sulfides (e.g., chalcopyrite, pyrite, pyrrhotite), halides (e.g., fluorite), and native elements (e.g., sulfur and copper). The datasets were collected from 170 pure mineral samples in the form of crystals, powders, and rock specimens, during three research projects: NEXT, Labex Ressources 21, and ARTeMIS. The extensive spectral range covered by the analyzer spectrometers (190–950 nm) allowed for the detection of both major (>1 wt.%) and trace (<1 wt.%) elements, recording a unique spectral signature for each mineral. Mineral spectra can serve as reference data to (i) identify relevant emission lines and spectral ranges for specific minerals, (ii) be compared to unknown LIBS spectra for mineral identification, or (iii) constitute input data for machine learning algorithms. Full article

30 pages, 5472 KiB  
Data Descriptor
The 1688 Sannio–Matese Earthquake: A Dataset of Environmental Effects Based on the ESI-07 Scale
by Angelica Capozzoli, Valeria Paoletti, Sabina Porfido, Alessandro Maria Michetti and Rosa Nappi
Data 2025, 10(3), 39; https://doi.org/10.3390/data10030039 - 19 Mar 2025
Viewed by 516
Abstract
The 1688 Sannio–Matese earthquake, with a macroseismically derived magnitude of Mw = 7 and an epicentral intensity of IMCS = XI, had a deep impact on Southern Italy, causing thousands of casualties, extensive damage and significant environmental effects (EEEs) in the epicentral area. Despite a comprehensive knowledge of its economic and social impacts, information regarding the earthquake’s environmental effects remains poorly studied and far from complete, hindering accurate intensity calculations by the Environmental Seismic Intensity Scale (ESI-07). This study aims to address this knowledge gap by compiling a thorough dataset of the EEEs induced by the earthquake. By consulting over one hundred historical, geological and scientific reports, we have collected and classified, using the ESI-07 scale, its primary and secondary EEEs, most of which were previously undocumented in the literature. We verified the historical sources regarding some of these effects through reconnaissance field mapping. Analysis of the obtained dataset reveals some primary effects (surface faulting) and extensive secondary effects, such as slope movements, ground cracks, hydrological anomalies, liquefaction and gas exhalation, which affected numerous towns. These findings enabled us to reassess the Sannio earthquake intensity, considering its environmental impact and comparing traditional macroseismic scales with the ESI-07. Our analysis allowed us to provide an epicentral intensity ESI of I = X, one degree lower than the published IMCS = XI. This study highlights the importance of combining traditional scales with the ESI-07 for more accurate hazard assessments. The macroseismic revision provides valuable insights for seismic hazard evaluation and land-use planning in the Sannio–Matese region, especially considering the distribution of the secondary effects. Full article

14 pages, 2091 KiB  
Data Descriptor
Historical Hourly Information of Four European Wind Farms for Wind Energy Forecasting and Maintenance
by Javier Sánchez-Soriano, Pedro Jose Paniagua-Falo and Carlos Quiterio Gómez Muñoz
Data 2025, 10(3), 38; https://doi.org/10.3390/data10030038 - 19 Mar 2025
Viewed by 216
Abstract
For an electric company, having an accurate forecast of the expected electrical production and maintenance from its wind farms is crucial. This information is essential for operating in various existing markets, such as the Iberian Energy Market Operator, Spanish Hub (OMIE in its Spanish acronym), the Portuguese Hub (OMIP in its Spanish acronym), and the Iberian electricity market between the Kingdom of Spain and the Portuguese Republic (MIBEL in its Spanish acronym), among others. The accuracy of these forecasts is vital for estimating the costs and benefits of handling electricity. This article explains the process of creating the complete dataset, which includes the acquisition of the hourly information of four European wind farms, and describes the structure and content of the dataset, which amounts to 2 years of hourly information. The wind farms are located in three countries: France (Auvergne-Rhône-Alpes), Spain (Aragon), and Italy (Piemonte). The dataset was built and validated following the CRISP-DM methodology, ensuring a structured and replicable approach to data processing and preparation. To confirm its reliability, the dataset was tested using a basic predictive model, demonstrating its suitability for wind energy forecasting and maintenance optimization. The dataset is openly available for improving the forecasting and management of wind farms, especially for the detection of faults and the preparation of a preventive maintenance plan.

15 pages, 1302 KiB  
Data Descriptor
Experimental Parametric Forecast of Solar Energy over Time: Sample Data Descriptor
by Fernando Venâncio Mucomole, Carlos Augusto Santos Silva and Lourenço Lázaro Magaia
Data 2025, 10(3), 37; https://doi.org/10.3390/data10030037 - 17 Mar 2025
Viewed by 197
Abstract
Variations in solar energy when it reaches the Earth impact the production of photovoltaic (PV) solar plants and, in turn, the dynamics of clean energy expansion. This incentivizes the objective of experimentally forecasting solar energy by parametric models, the results of which are then refined by machine learning methods (MLMs). To estimate solar energy, parametric models consider all atmospheric, climatic, geographic, and spatiotemporal factors that influence decreases in solar energy. In this study, data on ozone, evenly mixed gases, water vapor, aerosols, and solar radiation were gathered throughout the year in the mid-north area of Mozambique. The results show that the calculated solar energy was close to the theoretical solar energy under a clear sky. When paired with MLMs, the clear-sky index had a correlational order of 0.98, with most full-sun days having intermediate and clear-sky types. This suggests the potential of this area for PV use, with high correlation and regression coefficients in the range of 0.86 and 0.89 and a measurement error in the range of 0.25. We conclude that evenly mixed gases and the ozone layer have considerable influence on transmittance. However, the parametrically forecasted solar energy is close to the energy forecasted by the theoretical model. By adjusting the local characteristics, the model can be used in diverse contexts to increase PV plants’ electrical power output efficiency. Full article
(This article belongs to the Topic Smart Energy Systems, 2nd Edition)