| Title: | Descriptive Analysis and Visualization for Panel Data |
|---|---|
| Description: | Provides a comprehensive set of tools for describing and visualizing panel data structures, as well as for summarizing and visualizing variables within a panel data context. |
| Authors: | Dmitrii Tereshchenko [aut, cre] (ORCID: <https://orcid.org/0000-0002-8973-542X>) |
| Maintainer: | Dmitrii Tereshchenko <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.1 |
| Built: | 2026-05-30 18:04:41 UTC |
| Source: | https://github.com/dtereshch/paneldesc |
This function performs one-way tabulations and decomposes counts into between and within components for categorical (factor) variables in panel data.

    decompose_factor(data, select = NULL, index = NULL, format = "wide", digits = 3)

| Argument | Description |
|---|---|
| data | A data.frame containing panel data in a long format. |
| select | A character vector specifying which categorical (factor) variables to analyze. If not specified, all factor variables in the data.frame will be used. |
| index | A character vector of length 1 or 2 specifying the names of the entity and (optionally) time variables. The first element is the entity variable; if a second element is provided, it is used as the time variable. Not required if data has panel attributes. |
| format | A character string specifying the output format: "wide" or "long". Default = "wide". |
| digits | An integer indicating the number of decimal places to round shares. Default = 3. |

The output format is controlled by the format parameter.
When format = "wide" (default), returns a data.frame with columns:

| Column | Description |
|---|---|
| variable | The name of the analyzed variable |
| category | The category level of the variable |
| count_overall | Overall frequency (person-time observations) |
| share_overall | Overall share (count_overall / total_obs) |
| count_between | Between-entity frequency (number of entities ever having this category) |
| share_between | Between-entity share (count_between / total_entities) |
| share_within | Within-entity share (average share of time entities have this category) |

When format = "long", returns a data.frame with columns:

| Column | Description |
|---|---|
| variable | The name of the analyzed variable |
| category | The category level of the variable |
| dimension | Type of decomposition: "overall", "between", or "within" |
| count | Frequency count (NA for within dimension) |
| share | Share proportion (0 to 1) |

The object has class "panel_summary" and two additional attributes:

| Attribute | Description |
|---|---|
| metadata | List containing the function name and the arguments used. |
| details | List containing additional information: count_entities. |

A data.frame with categorical panel data decomposition statistics.
For Stata users: This corresponds to the xttab command.
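
As a rough illustration of how the overall, between, and within quantities relate to each other, the following base R sketch computes them by hand on a small made-up panel. The data frame `toy` and its columns are hypothetical. The within share here follows the Stata xttab convention (averaging only over entities that ever show the category); that is an assumption for the sketch, not a statement of how decompose_factor() is implemented.

```r
# Hypothetical toy panel in long format (not the package's production data)
toy <- data.frame(
  id     = rep(1:3, each = 3),
  year   = rep(2001:2003, times = 3),
  status = factor(c("a", "a", "b",  "b", "b", "b",  "a", "b", "a"))
)

# counts[i, k] = number of periods entity i spends in category k
counts <- unclass(table(toy$id, toy$status))

# Overall: entity-time observations per category
count_overall <- colSums(counts)
share_overall <- count_overall / sum(counts)

# Between: entities that ever have the category
ever          <- counts > 0
count_between <- colSums(ever)
share_between <- count_between / nrow(counts)

# Within (assumed xttab convention): share of time spent in the category,
# averaged over the entities that ever have it
row_share    <- counts / rowSums(counts)
share_within <- sapply(colnames(counts),
                       function(k) mean(row_share[ever[, k], k]))

result <- data.frame(
  category      = colnames(counts),
  count_overall = as.vector(count_overall),
  share_overall = round(as.vector(share_overall), 3),
  count_between = as.vector(count_between),
  share_between = round(as.vector(share_between), 3),
  share_within  = round(as.vector(share_within), 3)
)
print(result)
```

Note that the between shares need not sum to one, because an entity that switches categories is counted once for each category it ever holds.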
See also decompose_numeric(), summarize_transition().

    data(production)
    # Basic usage
    decompose_factor(production, index = "firm")
    # With panel_data object
    panel <- make_panel(production, index = c("firm", "year"))
    decompose_factor(panel)
    # Selecting specific variables
    decompose_factor(production, select = "industry", index = "firm")
    # Returning results in a long format
    decompose_factor(production, index = "firm", format = "long")
    # Custom rounding
    decompose_factor(production, index = "firm", digits = 2)
    # Accessing attributes
    out_dec_fac <- decompose_factor(production, index = "firm")
    attr(out_dec_fac, "metadata")
    attr(out_dec_fac, "details")

This function decomposes variance of numeric variables into between and within components in panel data.

    decompose_numeric(data, select = NULL, index = NULL, detail = TRUE, format = "long", digits = 3)

| Argument | Description |
|---|---|
| data | A data.frame containing panel data in a long format. |
| select | A character vector specifying which numeric variables to analyze. If not specified, all numeric variables in the data.frame will be used. |
| index | A character vector of length 1 or 2 specifying the names of the entity and (optionally) time variables. The first element is the entity variable; if a second element is provided, it is used as the time variable. Not required if data has panel attributes. |
| detail | A logical flag indicating whether to return detailed Stata-like output. Default = TRUE. |
| format | A character string specifying the output format: "long" or "wide". Default = "long". |
| digits | An integer indicating the number of decimal places to round statistics. Default = 3. |

The output format is controlled by two parameters: format and detail.
When format = "long" and detail = TRUE (default), returns a data.frame with:

| Column | Description |
|---|---|
| variable | The name of the analyzed variable |
| dimension | Type of decomposition: "overall", "between", or "within" |
| mean | Mean value (only for "overall" row) |
| std | Standard deviation |
| min | Minimum value |
| max | Maximum value |
| count | Number of observations or entities |

When format = "long" and detail = FALSE, returns a data.frame with:

| Column | Description |
|---|---|
| variable | The name of the variable |
| dimension | Type of decomposition: "overall", "between", or "within" |
| mean | Mean value |
| std | Standard deviation |

When format = "wide" and detail = TRUE, returns a data.frame with:

| Column | Description |
|---|---|
| variable | The name of the variable |
| mean | Overall mean |
| std_overall | Overall standard deviation |
| min_overall | Overall minimum |
| max_overall | Overall maximum |
| count_overall | Number of observations |
| std_between | Between-entity standard deviation |
| min_between | Minimum of entity means |
| max_between | Maximum of entity means |
| count_between | Number of entities |
| std_within | Within-entity standard deviation |
| min_within | Within-entity minimum (transformed) |
| max_within | Within-entity maximum (transformed) |
| count_within | Average observations per entity |

When format = "wide" and detail = FALSE, returns a data.frame with:

| Column | Description |
|---|---|
| variable | The name of the variable |
| mean | Overall mean |
| std_overall | Overall standard deviation |
| std_between | Between-entity standard deviation |
| std_within | Within-entity standard deviation |

The object has class "panel_summary" and two additional attributes:

| Attribute | Description |
|---|---|
| metadata | List containing the function name and the arguments used. |
| details | List containing additional information: count_entities. |

A data.frame with panel data decomposition statistics.
For Stata users: This corresponds to the xtsum command.
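
To make the between and within rows more concrete, here is a minimal base R sketch on made-up data. It assumes the usual xtsum convention: the between statistics are taken over entity means, and the within statistics are taken over deviations from the entity mean with the grand mean added back (which is presumably what "transformed" refers to above). The data frame `toy` and the helper `stats_of()` are hypothetical, and the package's own computation may differ in details such as degrees of freedom.

```r
# Hypothetical toy panel (not the package's production data)
toy <- data.frame(
  id = rep(1:3, each = 3),
  x  = c(1, 2, 3,  4, 6, 8,  10, 10, 10)
)

grand_mean   <- mean(toy$x)
entity_means <- as.vector(tapply(toy$x, toy$id, mean))  # one mean per entity

# Within transformation (assumed xtsum convention):
# deviation from the entity mean, re-centred on the grand mean
within_x <- toy$x - ave(toy$x, toy$id) + grand_mean

# Hypothetical helper returning the three spread statistics
stats_of <- function(v) c(std = sd(v), min = min(v), max = max(v))

decomp <- data.frame(
  dimension = c("overall", "between", "within"),
  mean      = c(grand_mean, NA, NA),
  rbind(stats_of(toy$x), stats_of(entity_means), stats_of(within_x))
)
print(decomp)
```

Entity 3 never varies over time, so it contributes nothing to the within spread but still shapes the between spread through its mean.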
See also decompose_factor(), summarize_numeric(), plot_heterogeneity().

    data(production)
    # Basic usage
    decompose_numeric(production, index = "firm")
    # With panel_data object
    panel <- make_panel(production, index = c("firm", "year"))
    decompose_numeric(panel)
    # Selecting specific variables
    decompose_numeric(production, select = c("sales", "labor"), index = "firm")
    # Returning results in a wide format without excessive details
    decompose_numeric(production, index = "firm", detail = FALSE, format = "wide")
    # Custom rounding
    decompose_numeric(production, index = "firm", digits = 2)
    # Accessing attributes
    out_dec_num <- decompose_numeric(production, index = "firm")
    attr(out_dec_num, "metadata")
    attr(out_dec_num, "details")

This function provides summary statistics for panel data structure with focus on balance and data completeness.

    describe_balance(data, index = NULL, detail = FALSE, digits = 3)

| Argument | Description |
|---|---|
| data | A data.frame containing panel data in a long format. |
| index | A character vector of length 2 specifying the names of the entity and time variables. Not required if data has panel attributes. |
| detail | A logical flag indicating whether to return additional statistics (5th, 25th, 50th, 75th, and 95th percentiles). Default = FALSE. |
| digits | An integer specifying the number of decimal places for rounding mean values. Default = 3. |

The statistics for entities describe the distribution of the number of entities observed per time period (cross‑sectional size per period). The statistics for periods describe the distribution of the number of time periods observed per entity (temporal length per entity).
The returned data.frame always contains the following columns:

| Column | Description |
|---|---|
| dimension | Either "entities" or "periods". |
| mean | Mean number of entities per period (or periods per entity). |
| std | Standard deviation. |
| min | Minimum value. |
| max | Maximum value. |

When detail = TRUE, five additional percentile columns are included:

| Column | Description |
|---|---|
| p5 | 5th percentile. |
| p25 | 25th percentile (first quartile). |
| p50 | 50th percentile (median). |
| p75 | 75th percentile (third quartile). |
| p95 | 95th percentile. |

All statistics are rounded to the number of decimal places specified by digits.

The object has class "panel_description" and two additional attributes:

| Attribute | Description |
|---|---|
| metadata | List containing the function name and the arguments used. |
| details | List containing the full presence matrix. |

A data.frame with panel data summary statistics for entities and periods.
An entity-time combination is considered present if the corresponding row contains at least one non‑NA value in any substantive variable (all columns except the entity and time identifiers).
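
The presence rule above can be sketched in base R as follows. This is only an illustration on made-up data: `toy`, its columns, and the helper `summarise_counts()` are hypothetical, and the package may build its presence matrix differently.

```r
# Hypothetical toy panel; y and z are the substantive variables
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3),
  year = c(2001, 2002, 2003, 2001, 2003, 2002),
  y    = c(5, NA, 7, 1, 2, NA),
  z    = c(NA, NA, 1, NA, 3, 4)
)

substantive <- setdiff(names(toy), c("id", "year"))

# A row counts as present if at least one substantive variable is non-NA
has_data <- rowSums(!is.na(toy[substantive])) > 0

# Entity-by-period presence matrix; factors keep entities and periods
# that have rows but no usable data
presence <- unclass(table(factor(toy$id)[has_data],
                          factor(toy$year)[has_data])) > 0

entities_per_period <- colSums(presence)  # cross-sectional size per period
periods_per_entity  <- rowSums(presence)  # temporal length per entity

# Hypothetical helper for the four basic statistics
summarise_counts <- function(v) {
  c(mean = mean(v), std = sd(v), min = min(v), max = max(v))
}

balance <- data.frame(
  dimension = c("entities", "periods"),
  round(rbind(summarise_counts(entities_per_period),
              summarise_counts(periods_per_entity)), 3)
)
print(balance)
```

Entity 1 has a row for 2002, but both substantive variables are NA there, so that entity-time combination is treated as absent.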
See also describe_dimensions(), describe_periods(), describe_patterns(), plot_periods().

    data(production)
    # Basic usage
    describe_balance(production, index = c("firm", "year"))
    # With panel_data object
    panel <- make_panel(production, index = c("firm", "year"))
    describe_balance(panel)
    # Returning detailed statistics
    describe_balance(production, index = c("firm", "year"), detail = TRUE)
    # Custom rounding
    describe_balance(production, index = c("firm", "year"), digits = 2)
    # Accessing attributes
    out_des_bal <- describe_balance(production, index = c("firm", "year"))
    attr(out_des_bal, "metadata")
    attr(out_des_bal, "details")

This function provides basic dimension counts for panel data: number of rows, unique entities, unique time periods, and substantive variables.

    describe_dimensions(data, index = NULL)

| Argument | Description |
|---|---|
| data | A data.frame containing panel data in a long format. |
| index | A character vector of length 2 specifying the names of the entity and time variables. Not required if data has panel attributes. |

The returned data.frame has the following structure:

| Column | Description |
|---|---|
| rows | Total number of rows in the data frame. |
| entities | Number of distinct values in the entity variable. |
| periods | Number of distinct values in the time variable. |
| variables | Number of substantive variables (all columns except entity and time). |

The object has class "panel_description" and two additional attributes:

| Attribute | Description |
|---|---|
| metadata | List containing the function name and the arguments used. |
| details | List with the actual vectors of entities, periods, and substantive variables. |

A data.frame containing panel dimension counts.
See also describe_balance(), describe_periods().

    data(production)
    # Basic usage
    describe_dimensions(production, index = c("firm", "year"))
    # With panel_data object
    panel <- make_panel(production, index = c("firm", "year"))
    describe_dimensions(panel)
    # Accessing attributes
    out_des_dim <- describe_dimensions(production, index = c("firm", "year"))
    attr(out_des_dim, "metadata")
    attr(out_des_dim, "details")

This function provides a descriptive table of entities with incomplete observations (missing values).

    describe_incomplete(data, index = NULL, detail = FALSE)

| Argument | Description |
|---|---|
| data | A data.frame containing panel data in a long format. |
| index | A character vector of length 1 or 2 specifying the names of the entity and (optionally) time variables. The first element is the entity variable; if a second element is provided, it is used as the time variable. Not required if data has panel attributes. |
| detail | A logical flag indicating whether to include detailed missing counts for each variable. Default = FALSE. |

The returned data.frame has the following structure:

| Column | Description |
|---|---|
| [entity] | The entity identifier (name matches input entity variable) |
| na_count | Total number of missing observations for the entity |
| variables | Number of variables with at least one missing value for that entity |

When detail = TRUE, additional columns are included for each substantive variable, showing the number of NAs in that variable for the entity.

The data.frame is sorted first by the number of variables with NAs (descending), then by the total number of NAs (descending).

The object has class "panel_description" and two additional attributes:

| Attribute | Description |
|---|---|
| metadata | List containing the function name and the arguments used. |
| details | List containing total entity counts and the IDs of incomplete entities. |

A data.frame with incomplete entities description.
The interpretation of incomplete entities may differ depending on whether the panel is balanced or unbalanced. In a balanced panel, each entity has the same number of time periods, so the total possible observations per entity are equal. In an unbalanced panel, entities may have different numbers of time periods, so the number of missing values should be interpreted relative to the entity's total observations. The function does not adjust for the number of time periods per entity; the missing counts reflect absolute counts of NAs in the data. Users should consider the panel structure when interpreting the results.
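
The counting and sorting described for this function, and the caveat about unbalanced panels, can be illustrated with a short base R sketch. Everything here is hypothetical: the `toy` data, and in particular the `na_share` column, which is not part of the function's output and is added only to show one way of relating the absolute NA count to an entity's number of observations.

```r
# Hypothetical toy panel with some missing values
toy <- data.frame(
  firm  = c("A", "A", "B", "B", "C", "C"),
  year  = rep(2001:2002, times = 3),
  sales = c(1, NA, 3, 4, NA, NA),
  labor = c(NA, 2, 3, 4, 5, NA)
)
vars <- setdiff(names(toy), c("firm", "year"))

# NA counts per entity and variable (rows = entities, columns = variables)
na_by_var <- rowsum(+is.na(toy[vars]), group = toy$firm)

out <- data.frame(
  firm      = rownames(na_by_var),
  na_count  = as.vector(rowSums(na_by_var)),
  variables = as.vector(rowSums(na_by_var > 0))
)

# Keep incomplete entities only, then sort as described above
out <- out[out$na_count > 0, ]
out <- out[order(-out$variables, -out$na_count), ]
rownames(out) <- NULL

# Illustrative extra (not in the package output): NAs relative to the
# number of cells the entity actually has
n_cells      <- as.vector(table(toy$firm)[out$firm]) * length(vars)
out$na_share <- round(out$na_count / n_cells, 3)
print(out)
```

In an unbalanced panel the same na_count can correspond to very different shares, which is why the absolute counts should be read alongside the number of periods per entity.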
See also summarize_missing(), describe_patterns(), describe_periods().

    data(production)
    # Basic usage with entity only
    describe_incomplete(production, index = "firm")
    # With time variable (check duplicates)
    describe_incomplete(production, index = c("firm", "year"))
    # With panel_data object
    panel <- make_panel(production, index = c("firm", "year"))
    describe_incomplete(panel)
    # Returning detailed results
    describe_incomplete(production, index = "firm", detail = TRUE)
    # Accessing attributes
    out_des_inc <- describe_incomplete(production, index = c("firm", "year"))
    attr(out_des_inc, "metadata")
    attr(out_des_inc, "details")

This function describes the presence patterns of entities over time in panel data.

    describe_patterns(data, index = NULL, delta = NULL, limits = NULL, detail = TRUE, format = "wide", digits = 3)

| Argument | Description |
|---|---|
| data | A data.frame containing panel data in a long format. |
| index | A character vector of length 2 specifying the names of the entity and time variables. Not required if data has panel attributes. |
| delta | An optional integer giving the expected interval between time periods. |
| limits | Either a single integer (show that many most frequent patterns) or a vector of two integers (show patterns with ranks between the two values, inclusive). If not specified, all patterns are shown. |
| detail | A logical flag indicating whether to return detailed patterns. Default = TRUE. |
| format | A character string specifying the output format: "wide" or "long". Default = "wide". |
| digits | An integer specifying the number of decimal places for rounding the share column. Default = 3. |

The output format is controlled by format and detail.
When format = "wide" and detail = TRUE (default):

| Column | Description |
|---|---|
| pattern | Pattern number (ranked by frequency). |
| [time1], [time2], ... | Presence (1) / absence (0) for each time period. |
| count | Number of entities sharing this pattern. |
| share | Proportion of entities with this pattern (rounded to digits). |

When format = "wide" and detail = FALSE, only the pattern and presence columns are returned.

When format = "long" and detail = TRUE:

| Column | Description |
|---|---|
| pattern | Pattern number. |
| [time] | Time period identifier (name equals the original time variable). |
| presence | Presence (1) / absence (0). |
| count | Number of entities with this pattern. |
| share | Proportion of entities with this pattern. |

When format = "long" and detail = FALSE, only pattern, time, and presence columns are returned.

Effect of delta: If delta is supplied, the function checks that all observed time points are separated by multiples of delta. If gaps are detected, a message lists the missing periods (unless the interval was inherited from panel attributes), and columns for those missing periods are added to the presence matrix, and therefore to the output data.frame, with all zeros. This ensures that the patterns reflect the full regular sequence of time periods.

The object has class "panel_description" and two additional attributes:

| Attribute | Description |
|---|---|
| metadata | List containing the function name and the arguments used. |
| details | List with the full presence matrix, pattern‑entity mapping, and the pattern matrix. |

A data.frame with presence patterns.
An entity-time combination is considered present if the corresponding row contains at least one non‑NA value in any substantive variable (i.e., all columns except the entity and time identifiers).
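
One plausible way to go from a presence matrix to a ranked pattern table is sketched below in base R. The matrix `presence` is made up, and collapsing rows into string keys is simply a convenient technique for the illustration; it is not claimed to be what describe_patterns() does internally.

```r
# Hypothetical presence matrix: rows = entities, columns = time periods
presence <- rbind(
  A = c(1, 1, 1),
  B = c(1, 0, 1),
  C = c(1, 1, 1),
  D = c(1, 0, 1),
  E = c(1, 1, 1),
  F = c(0, 1, 1)
)
colnames(presence) <- c("2001", "2002", "2003")

# Collapse each entity's row into a pattern key such as "101"
key <- apply(presence, 1, paste, collapse = "")

# Rank distinct patterns by the number of entities sharing them
freq         <- sort(table(key), decreasing = TRUE)
first_entity <- match(names(freq), key)  # one representative row per pattern

patterns <- data.frame(
  pattern = seq_along(freq),
  presence[first_entity, , drop = FALSE],
  count   = as.vector(freq),
  share   = round(as.vector(freq) / length(key), 3),
  check.names = FALSE
)
rownames(patterns) <- NULL
print(patterns)

# Analogue of limits = 2 (top two patterns) and limits = c(2, 3) (ranks 2 to 3)
top_two     <- head(patterns, 2)
ranks_2_to_3 <- patterns[2:3, ]
```

The vector `key` also gives the pattern-to-entity mapping for free, since its names are the entity identifiers.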
See also plot_patterns(), describe_periods(), describe_balance().

    data(production)
    # Basic usage
    describe_patterns(production, index = c("firm", "year"))
    # With panel_data object
    panel <- make_panel(production, index = c("firm", "year"))
    describe_patterns(panel)
    # Specifying time interval
    describe_patterns(production, index = c("firm", "year"), delta = 1)
    # Showing only the top 3 patterns
    describe_patterns(production, index = c("firm", "year"), limits = 3)
    # Showing patterns ranked 4 to 6
    describe_patterns(production, index = c("firm", "year"), limits = c(4, 6))
    # Returning results in a long format without excessive details
    describe_patterns(production, index = c("firm", "year"), detail = FALSE, format = "long")
    # Custom rounding
    describe_patterns(production, index = c("firm", "year"), digits = 2)
    # Accessing attributes
    out_des_pat <- describe_patterns(production, index = c("firm", "year"))
    attr(out_des_pat, "metadata")
    attr(out_des_pat, "details")

This function calculates, for each time period, the number of entities that have at least one non‑missing value in any substantive variable, and the corresponding share of all entities.

    describe_periods(data, index = NULL, delta = NULL, digits = 3)

| Argument | Description |
|---|---|
| data | A data.frame containing panel data in a long format. |
| index | A character vector of length 2 specifying the names of the entity and time variables. Not required if data has panel attributes. |
| delta | An optional integer giving the expected interval between time periods. |
| digits | An integer specifying the number of decimal places for rounding the share column. Default = 3. |

The returned data.frame contains the following columns:

| Column | Description |
|---|---|
| [time] | Time period identifier (name matches the input time variable). |
| count | Number of distinct entities observed in that period, i.e., entities with at least one row containing a non‑NA value in substantive variables. |
| share | Proportion of entities observed in that period (0 to 1), rounded to digits. |

Effect of delta: If delta is supplied, the function checks that all observed time points are separated by multiples of delta. If gaps are detected, a message lists the missing periods (unless the interval was inherited from panel attributes). For each missing period, a row is added to the output with count = 0 and share = 0, ensuring that the output covers the full regular time sequence.

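
The gap handling described for delta can be sketched in base R like this. The data are made up, and for brevity every row is treated as present (there are no substantive variables to inspect); `toy`, `full_seq`, and `missing_periods` are hypothetical names.

```r
# Hypothetical toy panel with a gap: no entity is observed in 2003
toy <- data.frame(
  id   = c(1, 2, 3,  1, 2,  1, 3,  2, 3),
  year = c(2001, 2001, 2001,  2002, 2002,  2004, 2004,  2005, 2005)
)
delta <- 1

observed <- sort(unique(toy$year))

# All observed time points must be separated by multiples of delta
stopifnot(all(diff(observed) %% delta == 0))

# Full regular sequence and the periods missing from it
full_seq        <- seq(min(observed), max(observed), by = delta)
missing_periods <- setdiff(full_seq, observed)
if (length(missing_periods) > 0) {
  message("Missing periods: ", paste(missing_periods, collapse = ", "))
}

# Distinct entities per period; periods without data become 0
count <- tapply(toy$id, factor(toy$year, levels = full_seq),
                function(i) length(unique(i)))
count[is.na(count)] <- 0

periods <- data.frame(
  year  = full_seq,
  count = as.vector(count),
  share = round(as.vector(count) / length(unique(toy$id)), 3)
)
print(periods)
```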
The object has class "panel_description" and two additional attributes:

| Attribute | Description |
|---|---|
| metadata | List containing the function name and the arguments used. |
| details | List with a named list entities giving, for each period, the vector of entities observed. |

A data.frame summarizing entity presence by time period.
See also plot_periods(), describe_balance(), describe_patterns().

    data(production)
    # Basic usage
    describe_periods(production, index = c("firm", "year"))
    # With panel_data object
    panel <- make_panel(production, index = c("firm", "year"))
    describe_periods(panel)
    # Specifying time interval
    describe_periods(production, index = c("firm", "year"), delta = 1)
    # Custom rounding
    describe_periods(production, index = c("firm", "year"), digits = 2)
    # Accessing attributes
    out_des_per <- describe_periods(production, index = c("firm", "year"))
    attr(out_des_per, "metadata")
    attr(out_des_per, "details")

This function creates a balanced panel dataset by either keeping only entities present in all time periods, keeping only periods where all entities are present, or expanding the data to include all entity-time combinations.

    make_balanced(data, index = NULL, delta = NULL, balance = "rows")

| Argument | Description |
|---|---|
| data | A data.frame containing panel data in a long format. |
| index | A character vector of length 2 specifying the names of the entity and time variables. |
| delta | An optional integer giving the expected interval between time periods. |
| balance | One of "rows", "entities", or "periods". Specifies the balancing method (see Details). Default = "rows". |

This function balances a panel dataset according to the chosen method. The returned object has class "panel_data" and includes metadata attributes similar to make_panel().

Balancing methods:

| Method | Description |
|---|---|
| balance = "rows" | Create a row for every entity‑time combination. If delta is supplied, the full time sequence (including missing periods) is used. Missing combinations get NA in all other columns. |
| balance = "entities" | Keep only entities present in all time periods. |
| balance = "periods" | Keep only time periods where all entities are present. |

Duplicates: If duplicate entity-time combinations exist, the function stops with an error, as balancing requires a unique key.

Missing values: Rows with missing entity or time values are automatically removed before balancing.

Handling of panel_data objects: If data is a panel_data object, the function will use the entity, time, and delta values stored in its attributes unless overridden by explicit index or delta arguments.

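
The three balancing methods can be mimicked in base R on a small made-up panel, as sketched below. The data frame `toy` and all object names are hypothetical, and the sketch ignores delta and the panel_data attributes that make_balanced() attaches to its result.

```r
# Hypothetical toy panel: entity 2 is not observed in 2002
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 3),
  year = c(2001, 2002, 2003, 2001, 2003, 2001, 2002, 2003),
  y    = c(10, 11, 12, 20, 22, 30, 31, 32)
)

# Balancing needs a unique entity-time key
stopifnot(anyDuplicated(toy[c("id", "year")]) == 0)

ids   <- sort(unique(toy$id))
years <- sort(unique(toy$year))

# "rows": full entity-by-period grid, NA where a combination was missing
grid          <- expand.grid(id = ids, year = years)
balanced_rows <- merge(grid, toy, by = c("id", "year"), all.x = TRUE)

# "entities": keep entities observed in every period
n_years           <- as.vector(tapply(toy$year, toy$id,
                                      function(t) length(unique(t))))
balanced_entities <- toy[toy$id %in% ids[n_years == length(years)], ]

# "periods": keep periods in which every entity is observed
n_ids            <- as.vector(tapply(toy$id, toy$year,
                                     function(i) length(unique(i))))
balanced_periods <- toy[toy$year %in% years[n_ids == length(ids)], ]

print(balanced_rows)
```

Expanding rows keeps all the information at the cost of introducing NAs, whereas the other two methods avoid NAs by discarding entities or periods.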
A balanced panel data.frame with additional attributes.
See also make_panel(), make_wide(), make_long(), describe_dimensions(), describe_balance().

    data(production)
    # Create a panel object first
    panel <- make_panel(production, index = c("firm", "year"))
    # Expand to full grid (default method)
    balanced_rows <- make_balanced(panel)
    # Keep only entities present in all periods
    balanced_entities <- make_balanced(panel, balance = "entities")
    # Keep only periods where all entities are present
    balanced_periods <- make_balanced(panel, balance = "periods")
    # Using a regular data frame (index must be provided)
    balanced_rows2 <- make_balanced(production, index = c("firm", "year"))
    # Specifying time interval for yearly data
    balanced_rows_delta <- make_balanced(production, index = c("firm", "year"), delta = 1)

This function performs within-group demeaning (centering) for all numeric variables in a data frame. For each group defined by the group argument, the group mean is subtracted from each observation. If no grouping is provided, the overall mean is subtracted (grand mean centering). Non‑numeric variables are not demeaned and are returned unchanged.

    make_demeaned(data, group = NULL)

| Argument | Description |
|---|---|
| data | A data.frame containing the variables to be demeaned. |
| group | A character vector specifying the grouping variable(s). If not specified, the behavior depends on whether data is a panel_data object, as described below. |

If group is not specified and data is not a panel_data object, simple overall demeaning is performed: for each numeric variable, the overall mean (ignoring NAs) is subtracted.

If group is specified, the grouping variables are used to define the groups. Observations with NA in any grouping variable are removed before demeaning.

Missing values in numeric variables are not removed automatically; the user should handle them prior to calling this function if desired.

If data inherits from panel_data and group is not specified, the function automatically uses the entity and time variables stored in the metadata attribute as grouping variables, and the returned object retains the panel_data class and its attributes.

Non‑numeric variables are not demeaned and are returned unchanged.

Demeaning algorithms:

One group: x - mean(x | group) (exact, using ave with na.rm = TRUE).

Two or more groups: iterative Gauss–Seidel algorithm (alternating projections). This matches the fixest fixed‑effect residuals exactly, even for unbalanced panels. The algorithm runs up to 100 iterations with tolerance 1e-12; a warning is issued if convergence is not reached.

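
The two demeaning strategies can be illustrated with a compact base R sketch. The functions `group_mean()` and `demean_two_way()` and the data frame `toy` are hypothetical, and the stopping rule (largest change between sweeps below the tolerance) is an assumption; only the iteration cap of 100 and the tolerance of 1e-12 are taken from the description above. The cross-check uses OLS residuals from lm() with two sets of dummies rather than fixest, so that the example has no dependencies.

```r
# Hypothetical unbalanced toy panel (entity 2 lacks period 2)
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 3),
  year = c(1, 2, 3, 1, 3, 1, 2, 3),
  x    = c(2.0, 3.5, 1.0, 7.0, 4.0, 0.5, 6.0, 9.0)
)

# Group means that ignore NAs, returned at observation level
group_mean <- function(v, g) {
  ave(v, g, FUN = function(z) mean(z, na.rm = TRUE))
}

# One group: subtract the group mean directly (exact)
x_within_id <- toy$x - group_mean(toy$x, toy$id)

# Two groups: alternate the two demeaning steps until the change
# between successive sweeps falls below the tolerance
demean_two_way <- function(x, g1, g2, max_iter = 100, tol = 1e-12) {
  r <- x
  for (iter in seq_len(max_iter)) {
    r_old <- r
    r <- r - group_mean(r, g1)
    r <- r - group_mean(r, g2)
    if (max(abs(r - r_old)) < tol) return(r)
  }
  warning("demeaning did not converge within max_iter iterations")
  r
}

x_two_way <- demean_two_way(toy$x, toy$id, toy$year)

# Cross-check: residuals of OLS on entity and period dummies
ols_resid <- unname(residuals(lm(x ~ factor(id) + factor(year), data = toy)))
print(max(abs(x_two_way - ols_resid)))
```

In a balanced panel a single sweep already gives the exact answer; the iteration only matters when some entity-time cells are missing.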
The returned object has a metadata attribute and a details attribute:

| Attribute | Description |
|---|---|
| metadata | List containing the function name ("make_demeaned") and the grouping variables used (group). If the input was a panel_data object and group was not specified, the original panel metadata (entity, time, and delta if present) are also included. |
| details | List with any additional information. If the input was a panel_data object, the original panel details are preserved. |

The input data frame with all numeric variables replaced by their demeaned versions. Rows with missing values in the grouping variables are removed. Missing values in numeric variables are left untouched, and group means are computed ignoring NAs.

See also make_panel(), make_balanced(), make_wide(), make_long().

    data(production)
    # Simple overall demeaning
    prod_demeaned <- make_demeaned(production)
    head(prod_demeaned$labor)
    # Demeaning by a single group (e.g., firm)
    prod_demeaned_firm <- make_demeaned(production, group = "firm")
    # Demeaning by two groups (e.g., firm and year); matches fixest
    prod_demeaned_both <- make_demeaned(production, group = c("firm", "year"))
    # Using a panel_data object: automatically demeans by firm and year
    panel <- make_panel(production, index = c("firm", "year"))
    panel_demeaned <- make_demeaned(panel)

This function reshapes panel data from wide format to long format, stacking time-varying columns into rows based on the pattern of column names.

    make_long(data, index = NULL, spacer = "_", invert = FALSE)

| Argument | Description |
|---|---|
| data | A data.frame containing panel data in a wide format. |
| index | A character vector of length 2 specifying the name of the entity column (first element) and the name to give to the new time column in the long format (second element). |
| spacer | A character string used to separate variable names and time values in the wide column names. Default = "_". |
| invert | A logical flag indicating the order of components in column names. If FALSE, the variable name comes first and the time value last; if TRUE, the time value comes first. Default = FALSE. |

The function performs the following steps:

1. If data has panel attributes (e.g., from make_wide()) and index is not specified, the entity column, time column name, spacer, and invert are taken from the metadata.
2. Columns that do not contain the spacer (or do not match the expected pattern when spacer = "") are treated as time‑constant and are replicated for each time period.
3. Columns that match the pattern are split into variable names and time values; the set of unique time values defines the periods.
4. The data are reshaped to long format using stats::reshape().

The returned object has class "panel_data" and two additional attributes:

| Attribute | Description |
|---|---|
| metadata | List containing the function name, the entity and time variables, the spacer, and the invert setting. If the input was a panel_data object, the original metadata elements (delta, etc.) are preserved. |
| details | Preserved from the input if it was a panel_data object; otherwise an empty list. |

A data frame in long format, with one row per entity-time combination.
When spacer = "", the function assumes that all time‑varying columns have
a numeric suffix (if invert = FALSE) or numeric prefix (if invert = TRUE)
that represents the time period. Variable names may contain digits, but the
last contiguous block of digits is treated as the time suffix; for prefixes,
the first contiguous block of digits is treated as the time value. If a
column does not contain any digit, it is considered time‑constant.
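The suffix rule for spacer = "" can be illustrated with regular expressions. The sketch below covers the invert = FALSE case only and assumes the time value is the block of digits at the very end of the name; the column names are hypothetical and the package may parse names differently.

```r
# Hypothetical wide column names without a separator.
cols <- c("sales2001", "q1sales2002", "region")

# Strip the trailing block of digits to obtain the variable name.
var_name <- sub("[0-9]+$", "", cols)

# Whatever was stripped is the time value; "" means no numeric suffix.
time_chr <- substring(cols, nchar(var_name) + 1)
time_val <- as.integer(time_chr)

# Columns without a numeric suffix are treated as time-constant.
data.frame(column = cols, variable = var_name, time = time_val,
           time_constant = is.na(time_val))
```

Note that the digit inside "q1sales2002" stays part of the variable name, because only the final block of digits is removed.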
The function assumes that all time-varying columns follow a consistent naming pattern and that every variable appears for exactly the same set of time periods (balanced in the wide sense). If some variable‑time combinations are missing, a message is printed and those variables are omitted.
See also make_panel(), make_wide(), make_balanced(), make_demeaned().
data(production)

# First convert to wide, then back to long
wide <- make_wide(production, index = c("firm", "year"))
long <- make_long(wide)
head(long)

# With custom spacer and invert
wide2 <- make_wide(production, index = c("firm", "year"), spacer = ".", invert = TRUE)
long2 <- make_long(wide2, spacer = ".", invert = TRUE)

# Using panel attributes (no need to specify index/spacer/invert)
panel <- make_panel(production, index = c("firm", "year"))
wide3 <- make_wide(panel)
long3 <- make_long(wide3)

# Using spacer = "" (no separator)
wide4 <- make_wide(production, index = c("firm", "year"), spacer = "")
long4 <- make_long(wide4, spacer = "")
This function adds panel structure attributes to a data.frame, storing entity and time variable names, and optionally checks the expected interval between time periods.
make_panel(data, index, delta = NULL, ...)
data |
A data.frame containing panel data in a long format. |
index |
A character vector of length 2 specifying the names of the entity and time variables. |
delta |
An optional integer giving the expected interval between time periods. |
... |
Additional arguments (not used, except to catch deprecated arguments). |
This function adds attributes to a data.frame to mark it as panel data.
The returned object has class "panel_data" and includes the following attributes:
metadata: List containing the function name and the arguments used (entity, time, and delta if provided).
details: List with diagnostic vectors:
entities: Unique values of the entity variable.
periods: Sorted unique values of the time variable.
periods_restored, periods_missing: If delta is supplied and gaps are detected, the full sequence and missing periods.
Effect of delta:
If delta is supplied, the function checks that all observed time points are separated by multiples of delta.
If gaps are detected, a message lists the missing periods and the full sequence is stored in details$periods_restored.
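The delta check can be pictured with a few lines of base R. This is a sketch under the assumption that time values are numeric; the object names mirror details$periods_restored and details$periods_missing, but the code is illustrative and not taken from the package.

```r
periods <- c(2000, 2002, 2006, 2008)  # sorted unique time values (toy data)
delta   <- 2                          # expected interval between periods

# Every spacing must be a multiple of delta for the sequence to be regular.
regular <- all(diff(periods) %% delta == 0)

# Full sequence implied by delta, and the periods absent from the data.
periods_restored <- seq(min(periods), max(periods), by = delta)
periods_missing  <- setdiff(periods_restored, periods)

periods_missing
```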
The input data.frame with additional attributes.
See also make_balanced(), make_wide(), make_long(), make_demeaned(), describe_dimensions().
data(production)

# Basic usage
panel <- make_panel(production, index = c("firm", "year"))

# Specifying time interval
panel <- make_panel(production, index = c("firm", "year"), delta = 1)

# Accessing attributes
attr(panel, "metadata")
attr(panel, "details")
This function reshapes panel data from long format to wide format, creating separate columns for each time period.
make_wide(data, index = NULL, spacer = "_", invert = FALSE)
data |
A data.frame containing panel data in a long format. |
index |
A character vector of length 2 specifying the names of the entity and time variables. |
spacer |
A character string to insert between variable names and time values in the wide format column names. Default = "_". |
invert |
A logical flag indicating whether to put time values before variable names in column names. If TRUE, time values come first. Default = FALSE. |
The function performs the following steps:
If data has panel attributes and index is not specified, the entity and time variables are taken from the metadata.
Rows with missing values in entity or time variables are removed.
Duplicate entity‑time combinations are detected and reported (unless they originate from panel attributes).
The data are reshaped to wide format using stats::reshape().
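A base R sketch of these steps on hypothetical toy data follows. It is not the package's code. In particular, the sketch drops duplicated entity‑time rows only so that the reshape is well defined, whereas the function itself reports duplicates and does not aggregate.

```r
# Toy long panel with one missing identifier and one duplicated firm-year.
long <- data.frame(
  firm = c(1, 1, 2, 2, 2, NA),
  year = c(1, 2, 1, 2, 2, 1),
  y    = c(10, 11, 20, 21, 22, 30)
)

# Remove rows with a missing entity or time value.
long <- long[!is.na(long$firm) & !is.na(long$year), ]

# Detect duplicated entity-time combinations.
dup   <- duplicated(long[c("firm", "year")])
n_dup <- sum(dup)

# Assumption for this sketch only: keep the first occurrence of each pair.
long <- long[!dup, ]

# Reshape to wide: one row per entity, one column per variable-period pair.
wide <- stats::reshape(long, direction = "wide", idvar = "firm",
                       timevar = "year", sep = "_")
wide
```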
The returned object has class "panel_data" and two additional attributes:
metadata: List containing the function name, the entity and time variables, the spacer, and the invert setting. If the input was a panel_data object, the original metadata elements (delta, etc.) are preserved.
details: Preserved from the input if it was a panel_data object; otherwise an empty list.
A data frame in wide format, with one row per entity.
The function works for standard atomic types (logical, integer, double,
complex, character, raw) and for factors. However, non‑standard column types
such as Date, POSIXct, or custom S3/S4 classes may lose their special
attributes during reshaping. Duplicate entity-time combinations must be
resolved beforehand; the function will issue a message but does not aggregate.
See also make_panel(), make_long(), make_balanced(), make_demeaned().
data(production)

# Basic conversion
wide <- make_wide(production, index = c("firm", "year"))
head(wide)

# With panel_data object
panel <- make_panel(production, index = c("firm", "year"))
wide2 <- make_wide(panel)

# Custom spacer and inverted order
wide3 <- make_wide(production, index = c("firm", "year"), spacer = ".", invert = TRUE)
names(wide3)
This function creates visualizations of heterogeneity among groups.
plot_heterogeneity(data, select, group = NULL, colors = c("darkblue", "gray"))
data |
A data.frame containing variables for analysis. |
select |
A character string specifying the numeric variable of interest. |
group |
A character string or vector of character strings specifying the grouping variable(s). If data has panel attributes and group is not specified, both the entity and time variables will be used as grouping variables. |
colors |
A character vector of two colors: first for mean line and points, second for individual points. Default = c("darkblue", "gray"). |
This function creates one or more plots (depending on the number of grouping variables) showing the heterogeneity among groups. Each plot displays individual observations (points) and group means (connected line).
The returned list contains the following components:
metadata: List containing the function name, selection, group, and colors.
details: List containing group-level statistics for each grouping variable, each containing means, standard deviations, and counts per group.
Invisibly returns a list with summary statistics and metadata.
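The group-level statistics stored in details (means, standard deviations, and counts per group) can be approximated in base R as follows. The toy data and the name stats_by_year are assumptions made for illustration; the structure returned by the function may differ.

```r
# Toy data: one numeric variable and one grouping variable.
df <- data.frame(
  year  = c(1, 1, 2, 2, 2),
  labor = c(4, 6, 5, 7, 9)
)

# Per-group mean, standard deviation, and number of non-NA observations.
stats_by_year <- data.frame(
  year  = sort(unique(df$year)),
  mean  = as.numeric(tapply(df$labor, df$year, mean, na.rm = TRUE)),
  sd    = as.numeric(tapply(df$labor, df$year, sd, na.rm = TRUE)),
  count = as.numeric(tapply(df$labor, df$year, function(x) sum(!is.na(x))))
)

stats_by_year
```

The mean column corresponds to the connected line in the plot, while the raw labor values correspond to the individual points.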
See also decompose_numeric(), summarize_numeric().
data(production)

# Basic usage with regular data.frame
plot_heterogeneity(production, select = "labor", group = "year")

# Using multiple grouping variables
plot_heterogeneity(production, select = "sales", group = c("firm", "industry", "year"))

# With panel_data object (uses both entity and time)
panel <- make_panel(production, index = c("firm", "year"))
plot_heterogeneity(panel, select = "labor")

# Custom colors
plot_heterogeneity(production, select = "sales", group = "year", colors = c("black", "gray"))

# Accessing list components
out_plo_het <- plot_heterogeneity(panel, select = "capital", group = "year")
out_plo_het$metadata
out_plo_het$details
This function creates a heatmap showing the number of missing values for each variable across all time periods in panel data.
plot_missing(data, select = NULL, index = NULL, colors = c("darkblue", "gray"))
data |
A data.frame containing panel data in a long format. |
select |
A character vector specifying which variables to include. If not specified, all substantive variables (except entity and time) are used. |
index |
A character vector of length 2 giving the names of the entity and time variables. Not required if data has panel attributes. |
colors |
A character vector of two colors defining the gradient for the heatmap. The first color represents the largest number of missing values, the second color the smallest number. Default = c("darkblue", "gray"). |
The function creates a heatmap where rows are variables and columns are time periods.
Cell color reflects the number of missing values in that variable for that period,
using a continuous gradient from colors[1] (most missing) to colors[2] (least missing).
Rows are ordered as the variables appear (first at the top). Columns are ordered chronologically.
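The count matrix behind such a heatmap (variables in rows, periods in columns) can be built in base R as sketched below. The toy data and the name na_mat are hypothetical; this illustrates the layout, not the function's internals.

```r
# Toy long panel with a few missing values.
df <- data.frame(
  firm  = c(1, 1, 2, 2),
  year  = c(1, 2, 1, 2),
  labor = c(NA, 3, NA, 4),
  sales = c(1, NA, 2, 3)
)

vars <- c("labor", "sales")

# For each variable, count NAs within each period, then put variables in rows.
na_mat <- t(sapply(vars, function(v) tapply(is.na(df[[v]]), df$year, sum)))

na_mat
```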
The returned list contains:
metadata: List containing the function call, select, entity/time variables, and colors.
details: List with the missing count matrix (variables × periods).
Invisibly returns a list with summary statistics and metadata.
The interpretation of missing counts may differ depending on whether the panel is balanced or unbalanced. In a balanced panel, each time period contains the same number of entities, so the raw NA counts per period are directly comparable across periods. In an unbalanced panel, the number of entities varies by period, so the raw NA counts should be interpreted relative to the number of observations available in each period. The function does not standardize the counts by period size; users should account for the panel structure when interpreting the results.
See also summarize_missing(), plot_patterns(), plot_periods().
data(production)

# Basic usage
plot_missing(production, index = c("firm", "year"))

# With panel_data object
panel <- make_panel(production, index = c("firm", "year"))
plot_missing(panel)

# Selecting specific variables
plot_missing(production, select = c("labor", "capital"), index = c("firm", "year"))

# Custom colors
plot_missing(production, index = c("firm", "year"), colors = c("black", "white"))

# Access the returned list
out_plo_mis <- plot_missing(production, index = c("firm", "year"))
out_plo_mis$metadata
out_plo_mis$details
This function creates a heatmap showing the presence/absence pattern of each entity over time.
plot_patterns(data, index = NULL, delta = NULL, limits = NULL, colors = c("darkblue", "white"))
data |
A data.frame containing panel data in a long format. |
index |
A character vector of length 2 specifying the names of the entity and time variables. Not required if data has panel attributes. |
delta |
An optional integer giving the expected interval between time periods. |
limits |
Either a single integer (show that many most frequent patterns) or a vector of two integers (show patterns with ranks between the two values, inclusive). If not specified, all patterns are shown. |
colors |
A character vector of two colors for present and missing observations. Default = c("darkblue", "white"). |
The function creates a heatmap where rows are entities and columns are time periods. Present cells are colored with the first color, missing cells with the second. Rows are ordered by pattern frequency: the most frequent pattern is at the top. Within each pattern block, entities appear in their original order.
Effect of delta:
If delta is supplied, the function checks for regular spacing and adds missing periods
(with all zeros) to the plot.
A message lists missing periods unless the interval was inherited from panel attributes.
The heatmap will therefore show columns for the full regular time sequence,
with missing periods appearing entirely white (or the color for missing).
The returned list contains:
metadata: List containing the function name and the arguments used.
details: List with the sorted presence matrix, pattern‑entity mapping, pattern count, and the pattern matrix (unique patterns as rows).
Invisibly returns a list with summary statistics and metadata.
An entity-time combination is considered present if the corresponding row contains at least one non‑NA value in any substantive variable (all columns except the entity and time identifiers).
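The presence rule and the resulting patterns can be sketched in base R. The toy data and object names below are assumptions; the code only illustrates how a presence matrix and pattern frequencies might be derived, and is not the function's implementation.

```r
# Toy long panel; firm 3 has a row in year 1 with all substantive values missing.
df <- data.frame(
  firm = c(1, 1, 2, 3, 3),
  year = c(1, 2, 1, 1, 2),
  y    = c(5, 6, 7, NA, 8),
  x    = c(1, 1, 1, NA, 2)
)

# A row is present if at least one substantive variable is non-NA.
present <- rowSums(!is.na(df[c("y", "x")])) > 0

# Entity-by-period presence matrix (TRUE = present).
presence <- unclass(table(df$firm[present], df$year[present])) > 0

# Collapse each row into a pattern string such as "11" or "10".
pattern <- apply(presence, 1, function(r) paste(as.integer(r), collapse = ""))

# Pattern frequencies, most frequent first.
sort(table(pattern), decreasing = TRUE)
```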
See also describe_patterns(), plot_periods(), plot_missing().
data(production)

# Basic usage
plot_patterns(production, index = c("firm", "year"))

# With panel_data object
panel <- make_panel(production, index = c("firm", "year"))
plot_patterns(panel)

# Specifying time interval
plot_patterns(production, index = c("firm", "year"), delta = 1)

# Show only the top 3 patterns
plot_patterns(production, index = c("firm", "year"), limits = 3)

# Show patterns ranked 4 to 6
plot_patterns(production, index = c("firm", "year"), limits = c(4, 6))

# Custom colors
plot_patterns(production, index = c("firm", "year"), colors = c("black", "white"))

# Accessing list components
out_plo_pat <- plot_patterns(production, index = c("firm", "year"))
out_plo_pat$metadata
out_plo_pat$details
This function calculates summary statistics and creates a histogram showing the distribution of time periods covered by each entity in panel data.
plot_periods(data, index = NULL, colors = c("darkblue", "white"))
data |
A data.frame containing panel data in a long format. |
index |
A character vector of length 2 specifying the names of the entity and time variables. Not required if data has panel attributes. |
colors |
A character vector of length 2 specifying the fill color and line color for the histogram. First color is for fill, second color is for the border line. Default = c("darkblue", "white"). |
The function creates a histogram of the number of time periods covered by each entity. The x‑axis shows coverage (periods per entity), the y‑axis shows the count of entities.
The returned list contains:
metadata: List containing the function name and the arguments used.
details: List with the coverage vector per entity and the histogram data used for plotting.
Invisibly returns a list with summary statistics and metadata.
An entity-time combination is considered present if the corresponding row contains at least one non‑NA value in any substantive variable (all columns except the entity and time identifiers).
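Coverage per entity, the quantity shown on the x‑axis, can be computed along these lines. This is a sketch on hypothetical toy data with a single substantive variable; entities with no present rows at all would simply not appear in this version.

```r
# Toy long panel; firm 1 has a missing value in year 3.
df <- data.frame(
  firm = c(1, 1, 1, 2, 3, 3),
  year = c(1, 2, 3, 1, 2, 3),
  y    = c(1, 2, NA, 4, 5, 6)
)

# With one substantive variable, a row is present when y is non-NA.
present <- !is.na(df$y)

# Number of distinct periods covered by each entity.
coverage <- tapply(df$year[present], df$firm[present],
                   function(t) length(unique(t)))

# Count of entities at each coverage level (the histogram's bar heights).
table(coverage)
```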
See also describe_periods(), plot_patterns(), plot_missing().
data(production)

# Basic usage
plot_periods(production, index = c("firm", "year"))

# With panel_data object
panel <- make_panel(production, index = c("firm", "year"))
plot_periods(panel)

# Custom colors
plot_periods(production, index = c("firm", "year"), colors = c("gray", "black"))

# Accessing list components
out_plo_per <- plot_periods(production, index = c("firm", "year"))
out_plo_per$metadata
out_plo_per$details
A simulated dataset containing firm-level panel data with industry affiliation, entry, exit, random missing values, and ownership information. The data follows industry-specific production structures with occasional industry and ownership changes.
production
A data frame with 180 rows (30 firms × 6 years) and 7 variables:
firm: integer; firm identifier (1 to 30)
year: integer; year identifier (1 to 6)
sales: numeric; firm sales/output generated from a Cobb-Douglas production function with industry-specific parameters and technology shocks. Contains random missing values (~2%).
capital: numeric; capital input, log‑normally distributed with firm-specific effects and industry-specific time trends. Contains random missing values (~2%).
labor: numeric; labor input, log‑normally distributed with firm-specific effects and industry-specific time trends. Contains random missing values (~2%).
industry: factor; industry affiliation with three levels: "Industry 1", "Industry 2", "Industry 3". Some firms change industry over time.
ownership: factor; ownership type with three levels: "private", "public", "mixed". The variable is mostly stable over time, changing with a probability of 5% per year.
The dataset exhibits several realistic features of firm-level panel data:
50% of firms (15 firms) have complete data for all 6 years.
50% of firms (15 firms) have entry and exit patterns with different start and end years.
Three industry categories with different production function parameters.
About 20% of firms change industry affiliation at least once.
Ownership changes occur with 5% probability per year.
Industry-specific Cobb‑Douglas parameters:
Industry 1: , , (labor‑intensive)
Industry 2: , , (balanced, high productivity)
Industry 3: , , (standard)
Additional random missing values (approx. 2%) in sales, capital, and labor.
Firm-specific effects and industry-specific time trends in inputs.
Technology shocks affecting output.
Simulated data for econometric analysis and demonstration purposes.
data(production)
head(production)
table(production$ownership)
This function calculates summary statistics for missing values (NAs) in panel data, providing both overall and detailed period-specific missing value counts.
summarize_missing(data, select = NULL, index = NULL, detail = FALSE, digits = 3)
data |
A data.frame containing panel data in a long format. |
select |
A character vector specifying which variables to analyze for missing values. If not specified, all variables (except entity and time) will be used. |
index |
A character vector of length 2 specifying the names of the entity and time variables. Not required if data has panel attributes. |
detail |
A logical flag indicating whether to return detailed period-specific NA counts. Default = FALSE. |
digits |
An integer indicating the number of decimal places to round the share column. Default = 3. |
When detail = FALSE, returns columns:
variable: Variable name.
na_count: Total number of missing values in that variable.
na_share: Proportion of missing values (rounded to digits).
entities: Number of distinct entities that have at least one missing value in that variable.
periods: Number of distinct time periods that have at least one missing value in that variable.
When detail = TRUE, additional columns for each time period contain the number of missing values in that variable for that period.
The object has class "panel_summary" and two additional attributes:
metadata: List containing the function name and the arguments used.
details: List with counts of variables with/without NAs, and their names.
A data.frame with missing value summary statistics.
The interpretation of missing counts may differ depending on whether the panel is balanced or unbalanced. In a balanced panel, each time period contains the same number of entities, so the raw NA counts per period are directly comparable across periods. In an unbalanced panel, the number of entities varies by period, so the raw NA counts should be interpreted relative to the number of observations available in each period. The function does not standardize the counts by period size; users should account for the panel structure when interpreting the results.
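For an unbalanced panel, one way to make period counts comparable is to divide them by the number of observations in each period. The sketch below does this by hand on hypothetical toy data; the function itself does not perform this step.

```r
# Toy unbalanced panel: four observations in year 1, two in year 2.
df <- data.frame(
  year  = c(1, 1, 1, 1, 2, 2),
  labor = c(NA, 1, 2, 3, NA, 4)
)

# Raw NA counts per period are equal (1 and 1) ...
na_count <- tapply(is.na(df$labor), df$year, sum)

# ... but the periods differ in size (4 and 2 observations).
n_obs <- tapply(df$labor, df$year, length)

# Shares show that year 2 is proportionally more affected.
na_share <- na_count / n_obs
na_share
```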
See also plot_missing(), describe_incomplete(), describe_patterns(), describe_periods().
data(production)

# Basic usage
summarize_missing(production, index = c("firm", "year"))

# With panel_data object
panel <- make_panel(production, index = c("firm", "year"))
summarize_missing(panel)

# Selecting specific variables
summarize_missing(production, select = c("labor", "capital"), index = c("firm", "year"))

# Returning detailed results
summarize_missing(production, index = c("firm", "year"), detail = TRUE)

# Custom rounding
summarize_missing(production, index = c("firm", "year"), digits = 2)

# Accessing attributes
out_sum_mis <- summarize_missing(production, index = c("firm", "year"))
attr(out_sum_mis, "metadata")
attr(out_sum_mis, "details")
This function calculates summary statistics for numeric variables, either overall or grouped by a single grouping variable.
summarize_numeric(data, select = NULL, group = NULL, detail = FALSE, digits = 3)
data |
A data.frame containing variables for analysis. |
select |
A character vector specifying which numeric variables to analyze. If not specified, all numeric variables in the data.frame will be used. |
group |
A character string specifying the grouping variable name. If not specified, overall statistics will be returned. |
detail |
A logical flag indicating whether to return additional statistics (25th, 50th, and 75th percentiles). Default = FALSE. |
digits |
An integer specifying the number of decimal places for rounding statistics. Default = 3. |
The returned data.frame contains columns depending on the arguments:
When no grouping variable is specified (overall):
variable: The name of the numeric variable.
count: Number of non‑NA observations.
mean: Arithmetic mean.
std: Standard deviation.
min: Minimum value.
max: Maximum value.
When detail = TRUE, additional columns are included:
p25: 25th percentile (first quartile).
p50: 50th percentile (median).
p75: 75th percentile (third quartile).
When a grouping variable is specified, statistics are calculated for each group, and the data.frame includes a column named after the grouping variable, followed by the same statistics columns as above.
The object has class "panel_summary" and two additional attributes:
metadata: List containing the function name and the arguments used.
details: List with counts of variables, groups, and total observations.
A data.frame with descriptive statistics summary.
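The listed statistics correspond to standard base R computations, sketched here for a single toy vector. The percentile method is an assumption: the sketch uses quantile() with its default type, which may or may not match what the function uses.

```r
x <- c(2, 4, 4, NA, 10)  # toy numeric variable with one missing value

# Overall statistics, computed on non-NA values.
overall <- c(
  count = sum(!is.na(x)),
  mean  = mean(x, na.rm = TRUE),
  std   = sd(x, na.rm = TRUE),
  min   = min(x, na.rm = TRUE),
  max   = max(x, na.rm = TRUE)
)

# Additional percentiles reported when detail = TRUE.
detail <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)

round(c(overall, p25 = detail[1], p50 = detail[2], p75 = detail[3]), 3)
```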
See also decompose_numeric(), plot_heterogeneity().
data(production)

# Basic usage
summarize_numeric(production)

# Selecting specific variables
summarize_numeric(production, select = "sales")
summarize_numeric(production, select = c("capital", "labor"))

# Grouped statistics
summarize_numeric(production, group = "year")

# Detailed statistics
summarize_numeric(production, detail = TRUE)

# Custom rounding
summarize_numeric(production, digits = 2)

# Accessing attributes
out_sum_num <- summarize_numeric(production)
attr(out_sum_num, "metadata")
attr(out_sum_num, "details")
Calculates transition counts and shares between states of a categorical (factor) variable across consecutive time periods within entities for panel data.
summarize_transition(data, select, index = NULL, format = "wide", digits = 3)
data |
A data.frame containing panel data in a long format. |
select |
A character string specifying the factor variable to analyze transitions for. |
index |
A character vector of length 2 specifying the names of the entity and time variables. Not required if data has panel attributes. |
format |
A character string specifying the output format: "wide" or "long". Default = "wide". |
digits |
An integer indicating the number of decimal places to round transition shares. Default = 3. |
The structure depends on format:
When format = "wide", a transition matrix as a data.frame:
from_to: The originating state (row label).
[state1], [state2], ...: Columns for each destination state, containing the share of transitions from the row state to the column state (rounded).
When format = "long", a data.frame with columns:
from: Originating state.
to: Destination state.
count: Number of observed transitions.
share: Proportion of transitions out of the from state that go to the to state (rounded).
The object has class "panel_summary" and two additional attributes:
metadata: List containing the function name and the arguments used.
details: List with the vector of all category levels.
A data.frame containing transition summaries.
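Transition counts and row shares of this kind can be reproduced in base R as sketched below. The toy data are hypothetical, and the sketch treats adjacent observed rows of the same entity as consecutive periods, which is an assumption; the function may handle gaps in time differently.

```r
# Toy long panel with a two-level factor.
df <- data.frame(
  firm  = c(1, 1, 1, 2, 2, 2),
  year  = c(1, 2, 3, 1, 2, 3),
  state = factor(c("A", "A", "B", "B", "B", "A"))
)

# Sort so that each entity's observations are in time order.
df <- df[order(df$firm, df$year), ]
n  <- nrow(df)

# Pair each row with the next one, keeping only pairs within the same entity.
same <- df$firm[-1] == df$firm[-n]
from <- df$state[-n][same]
to   <- df$state[-1][same]

# Transition counts and row-wise shares (each row sums to 1).
counts <- table(from = from, to = to)
shares <- round(prop.table(counts, margin = 1), 3)

shares
```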
See also decompose_factor().
data(production)

# Basic usage
summarize_transition(production, select = "industry", index = c("firm", "year"))

# With panel_data object
panel <- make_panel(production, index = c("firm", "year"))
summarize_transition(panel, select = "industry")

# Returning results in a long format
summarize_transition(production, select = "industry", index = c("firm", "year"), format = "long")

# Custom rounding
summarize_transition(production, select = "industry", index = c("firm", "year"), digits = 2)

# Accessing attributes
out_sum_tra <- summarize_transition(production, select = "industry", index = c("firm", "year"))
attr(out_sum_tra, "metadata")
attr(out_sum_tra, "details")