genproc() now integrates with the
progressr framework. When the calling code is wrapped in
progressr::with_progress(...), one progression signal is
emitted per completed case (in sequential and parallel modes; signals
from worker subprocesses are propagated by future.apply).
The user picks any handler (text bar, RStudio gadget, beeps, custom) via
progressr::handlers(). Without
with_progress(), the integration is a complete no-op.
progressr is in Suggests; the integration is
skipped when it is not installed. Live monitoring of non-blocking runs
is on the roadmap.
errors(result) returns the failed-case rows of the
log with all original columns (case_id, mask params, error_message,
traceback, duration_secs). Replaces the boilerplate
result$log[!result$log$success, ] pattern.
summary(result) (S3 method on
genproc_result) produces a compact human-readable digest:
status, success rate, per-case duration stats (mean, max, slowest
case_id), and the top recurring error messages by occurrence
(configurable via top_errors). Useful on runs with many
cases where the raw log is too noisy to eyeball.
rerun_failed(r0, f) helper. Sibling of
rerun_affected(): filters the original mask down to the
cases that failed and re-runs genproc() on that subset
only. Useful after fixing the cause of a transient failure.
rerun_affected(r0, diff, f) helper. Closes the
reproducibility loop: when diff_inputs() reports drift between two
runs, rerun_affected() filters the original mask down to
the cases that referenced the impacted files and re-runs
genproc() on that subset only. The resulting
genproc_result is a small refresh, not a full re-run.
diff_inputs() now returns a new
$cases_affected field: a data.frame with columns
case_id, path, column,
change_type listing every (case, input column) pair
impacted by the diff. Available both programmatically and as input to
rerun_affected(). The print method also shows a concise
summary (“Cases affected: N”) and a hint towards
rerun_affected().
print.genproc_input_diff now distinguishes small size
variations whose human-readable rounding is identical: when the
formatted size is the same on both sides, the byte delta is shown
explicitly (size: 1.1 KB -> 1.1 KB (+6 B)).
result$reproducibility$parallel now carries an
effective_strategy field alongside the user-requested
strategy. The two differ when the user passed
workers without an explicit strategy, in which
case genproc() auto-defaults to
"multisession"; the snapshot now records both, preserving
the audit trail of what was requested vs what was applied. The
Mode line of print(result) now shows the
effective strategy by default, so a sequential vs parallel multisession
run is no longer ambiguous in the printed summary.
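The requested-versus-effective distinction can be pictured with a small base-R sketch. resolve_strategy() is a hypothetical helper written for illustration, not genproc's internal code, and the "sequential" fallback when neither argument is given is an assumption.

```r
# Hypothetical helper: records what the user asked for next to what is applied.
resolve_strategy <- function(strategy = NULL, workers = NULL) {
  effective <- if (!is.null(strategy)) {
    strategy                # an explicit request always wins
  } else if (!is.null(workers)) {
    "multisession"          # workers given without a strategy: auto-default
  } else {
    "sequential"            # assumption: nothing requested means no parallelism
  }
  list(strategy = strategy, workers = workers, effective_strategy = effective)
}

snapshot <- resolve_strategy(workers = 4)
is.null(snapshot$strategy)    # the user never named a strategy
snapshot$effective_strategy   # what was actually applied
```

Keeping both fields means the snapshot can still answer "what did the user type?" after the default has been filled in.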
status() now distinguishes "done" (the
wrapper future resolved successfully) from "error" (the
wrapper crashed), even before await() is called. Previously
status() returned "done" as soon as the future
was resolved, regardless of outcome — leading to the misleading
Status: done (not collected) print on a job that had
actually failed. The peek result is cached in a shared environment so
that a subsequent await() does not re-materialize the
future.
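A minimal sketch of the peek-and-cache idea, in base R only. make_job() and its thunk argument are hypothetical stand-ins for an already-resolved future; the real implementation works on future objects and is not reproduced here.

```r
# Hypothetical stand-in: `thunk` plays the role of materializing a resolved future.
make_job <- function(thunk) {
  cache <- new.env(parent = emptyenv())   # shared by status() and await()
  peek <- function() {
    if (is.null(cache$outcome)) {
      # Materialize once and remember whether it succeeded or failed.
      cache$outcome <- tryCatch(
        list(status = "done", value = thunk()),
        error = function(e) list(status = "error", error = e)
      )
    }
    cache$outcome
  }
  list(
    status = function() peek()$status,
    await = function() {
      out <- peek()                       # reuses the cached outcome
      if (identical(out$status, "error")) stop(out$error)
      out$value
    }
  )
}

job <- make_job(function() stop("boom"))
job$status()   # reports the failure before any await()
```

Because both closures read the same environment, calling await() after status() does not evaluate the thunk a second time.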
print(result) is more informative: a
Started line shows the run’s timestamp, a Mode
line summarises the execution configuration (sequential,
multisession parallel (4 workers),
non-blocking + multisession parallel (6 workers), etc.),
and the method emits errors(x) / summary(x)
hints when failures occurred. The non-blocking print also distinguishes
done (not collected) from
error (not collected).
When parallel was used but startup overhead clearly
dominated the run, print(result) now emits a
Note warning. The note is triggered by either of two checks: parallel
efficiency below 50% when workers is supplied (catches cases like
parallel_spec(workers = 4) that yield no real speedup), or
wall-clock time above cumulative * 1.2 in power-user mode
(workers unknown). Both checks require wall > 0.5s to avoid noise. This
addresses the common surprise of enabling parallel on a small workload
and observing a slowdown.
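The two checks can be sketched as a single predicate. overhead_dominated() is a hypothetical name, and the efficiency formula (cumulative case time divided by wall-clock time times workers) is the usual textbook definition, assumed here rather than taken from the package source.

```r
# Hypothetical helper; `wall` and `cumulative` are durations in seconds.
overhead_dominated <- function(wall, cumulative, workers = NULL) {
  if (wall <= 0.5) return(FALSE)        # run too short to judge reliably
  if (!is.null(workers)) {
    # Assumed definition: fraction of the ideal speedup actually achieved.
    efficiency <- cumulative / (wall * workers)
    return(efficiency < 0.5)
  }
  # Workers unknown: flag runs slower than the summed per-case time plus 20%.
  wall > cumulative * 1.2
}

overhead_dominated(wall = 2, cumulative = 2, workers = 4)   # efficiency 0.25
overhead_dominated(wall = 0.3, cumulative = 0.3, workers = 4)
```

The first call is flagged because four workers took as long as the summed case durations; the second is ignored because the run falls below the 0.5 s floor.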
Tracebacks captured by the logged layer are now substantially
shorter and easier to read. Internal dispatcher frames
(execute_cases, do.call, FUN),
invocation context frames (source, eval,
withVisible), and PSOCK worker frames
(workRSOCK, workLoop,
workCommand, makeSOCKmaster) are now dropped
from the head of the stack, so the first surviving frame is always user
code. User calls to lapply() or do.call() from
within their own function are preserved (the head-position filter only
consumes leading frames).
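A head-position filter of this kind can be sketched on a plain character vector of frame names, ordered from the outermost call inwards. drop_leading_frames() is a hypothetical helper and the frame list is illustrative data, not a captured genproc traceback.

```r
internal <- c("execute_cases", "do.call", "FUN",
              "source", "eval", "withVisible",
              "workRSOCK", "workLoop", "workCommand", "makeSOCKmaster")

# Hypothetical helper: drop only the leading run of internal frames.
drop_leading_frames <- function(frames, internal) {
  # cumsum() turns positive at the first non-internal frame and stays
  # positive afterwards, so later matches are kept.
  frames[cumsum(!(frames %in% internal)) > 0]
}

frames <- c("source", "eval", "execute_cases", "do.call", "FUN",
            "my_fun", "do.call", "helper")
drop_leading_frames(frames, internal)
# "my_fun" "do.call" "helper"
```

The user's own do.call() survives because it sits after the first user frame, which is the behaviour described above.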
Composing parallel = parallel_spec(...) and
nonblocking = nonblocking_spec(...) now works out of the
box on Windows and in RStudio configurations where the wrapper
subprocess inherits getOption("mc.cores") set to 1.
Previously, the composed call failed with a parallelly
“only 1 CPU cores available” error, and (less visibly) emitted a
misleading soft-limit warning. genproc() now applies two
surgical adjustments inside the wrapper subprocess in the composed case
(only when the user has not set their own values): it sets
R_PARALLELLY_AVAILABLECORES_METHODS = "system" to lift the
hard limit, and raises options(mc.cores) to silence the
soft-limit warning. The calling session is never modified.
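The shape of the two adjustments can be sketched as follows. adjust_wrapper_limits() is a hypothetical helper meant to run inside the wrapper subprocess. How genproc() decides that a value is user-set is not described here, so the sketch assumes that an unset environment variable, or an mc.cores below the requested worker count, is safe to change.

```r
# Hypothetical helper, intended to be called inside the wrapper subprocess only.
adjust_wrapper_limits <- function(workers) {
  # Hard limit: only touch the variable when nothing has been set.
  if (!nzchar(Sys.getenv("R_PARALLELLY_AVAILABLECORES_METHODS"))) {
    Sys.setenv(R_PARALLELLY_AVAILABLECORES_METHODS = "system")
  }
  # Soft limit: raise an inherited mc.cores that is lower than the request.
  current <- getOption("mc.cores")
  if (is.null(current) || current < workers) {
    options(mc.cores = workers)
  }
  invisible(NULL)
}
```

Both Sys.setenv() and options() act on the process that calls them, which is why running this in the subprocess leaves the calling session untouched.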
First public release. The package consolidates the four execution
layers (logged, reproducibility, parallel, non-blocking) and the
building blocks (from_example_to_function(),
from_function_to_mask(),
rename_function_params(),
add_trycatch_logrow()) under a stable API contract. The
genproc_result S3 class fields are guaranteed
forward-compatible across the 0.x series.
genproc() runs a function over an iteration mask,
with two mandatory layers always active:
withCallingHandlers()) and per-case
timing.
parallel_spec() and the parallel
argument of genproc(): optional parallel dispatch over
future.apply::future_lapply(). Auto-defaults to
"multisession" when workers is passed without
an explicit strategy, restoring the previous plan on
exit.
nonblocking_spec() and the nonblocking
argument of genproc(): genproc() returns
immediately with a genproc_result of status
"running" while the run continues in a background future.
Use status() to poll, await() to block until
resolution. Composable with parallel.
result$reproducibility$inputs as
(method, files, refs). Heuristic detection by default;
explicit override via genproc(..., input_cols = ...) or
skip_input_cols = .... Disable with
track_inputs = FALSE.
diff_inputs(r0, r1) compares the input fingerprints
of two runs and reports changed / unchanged / added / removed files,
with a human-readable print method.
genproc_result with stable fields:
log, reproducibility, n_success,
n_error, duration_total_secs,
status.
log and surfaced in n_error.
case_ids are index-based (case_0001, …)
for now; a content-based variant is planned.
from_example_to_function(): turn an example expression
that works for one case into a parameterized function. String literals
and free symbols become parameters with the original value as default.
Built on a dependency-free AST rewriter.
from_function_to_mask(): derive a one-row template
data.frame from a function’s signature, ready to be
expanded into a full iteration mask.
rename_function_params(): rename parameters in formals
and body in one pass, without editing the function source.
add_trycatch_logrow(): the standalone logging wrapper
used by genproc(), exposed for users who want the logged
layer outside the full pipeline.
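The overall idea (a function, a mask derived from its signature, one log row per case) can be pictured in base R without the package. Everything below is an illustrative sketch: safe_log() and run_case() are hypothetical, the log keeps only a subset of the columns listed above, and plain tryCatch() is used where the logged layer relies on withCallingHandlers() to capture tracebacks.

```r
# Hypothetical worker: fails for non-positive input.
safe_log <- function(x = 1, label = "a") {
  if (x <= 0) stop("x must be positive")
  paste(label, log(x))
}

# One-row template from the signature defaults, then a hand-written full mask.
template <- as.data.frame(as.list(formals(safe_log)), stringsAsFactors = FALSE)
mask <- data.frame(x = c(1, -1, 10), label = c("a", "b", "c"),
                   stringsAsFactors = FALSE)

# Run one case and return its log row; an error becomes data, not a crash.
run_case <- function(i) {
  args <- as.list(mask[i, , drop = FALSE])
  started <- proc.time()[["elapsed"]]
  err <- tryCatch({
    do.call(safe_log, args)
    NA_character_                       # NA marks a successful case
  }, error = function(e) conditionMessage(e))
  data.frame(case_id = sprintf("case_%04d", i),
             mask[i, , drop = FALSE],
             success = is.na(err),
             error_message = err,
             duration_secs = proc.time()[["elapsed"]] - started,
             row.names = NULL, stringsAsFactors = FALSE)
}

log <- do.call(rbind, lapply(seq_len(nrow(mask)), run_case))
failed <- log[!log$success, ]   # the pattern that errors(result) replaces
failed$case_id                  # "case_0002"
```

The failing second case does not interrupt the other two, and its message ends up in the error_message column of its own row.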