Fill NAs in R with zero if the next valid data point is more than 2 intervals away
I have multiple vectors with NAs, and I want to replace with 0 every NA that is more than 2 intervals away from a valid data point. For example:
x <- c(3, 4, NA, NA, NA, 3, 3)
The expected output is:
3, 4, NA, 0, NA, 3, 3
Tags: r, replace, na
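To make the rule concrete, here is a slightly longer, hypothetical example of the behaviour the expected output suggests (and which the answers below implement): within every run of NAs, only the interior NAs are replaced by 0, while the NAs adjacent to valid values stay NA.
x_long <- c(1, NA, NA, NA, NA, 2, NA, 5)   # hypothetical input, not from the question
# desired:  1  NA   0   0  NA  2  NA  5    # interior NAs of the 4-NA run become 0; the lone NA stays NA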
6 Answers
Maybe there are simpler solutions but this one works.
na2zero <- function(x){
ave(x, cumsum(abs(c(0, diff(is.na(x))))), FUN = function(y){
if(anyNA(y)){
if(length(y) > 2) y[-c(1, length(y))] <- 0
}
y
})
}
na2zero(x)
#[1] 3 4 NA 0 NA 3 3
X <- list(x, c(x, x), c(3, 4, NA, NA, NA, NA, 3, 3))
lapply(X, na2zero)
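To see how the grouping in ave() works, here is a small sketch (my illustration, not part of the original answer) of the group ids produced for the example vector:
x <- c(3, 4, NA, NA, NA, 3, 3)
cumsum(abs(c(0, diff(is.na(x)))))
# [1] 0 0 1 1 1 2 2
# each maximal run of NA / non-NA values gets its own id, so FUN sees one run at a time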
Update -
Here's probably one of the simplest and fastest solutions (thanks to G. Grothendieck's answer): simply knowing whether the value on either side of an NA is also NA is sufficient information. Therefore, using lead and lag from the dplyr package -
na2zero <- function(x) {
x[is.na(lag(x, 1, 0)) & is.na(lead(x, 1, 0)) & is.na(x)] <- 0
x
}
na2zero(x = c(3, 4, NA, NA, NA, 3, 3))
[1] 3 4 NA 0 NA 3 3
na2zero(x = c(3, 4, NA, NA, NA, NA, NA, 3, 3))
[1] 3 4 NA 0 0 0 NA 3 3
na2zero(x = c(3, 4, NA, NA, NA, 3, 3, NA, NA, 1, NA, 0, 0, rep(NA, 4L)))
[1] 3 4 NA 0 NA 3 3 NA NA 1 NA 0 0 NA 0 0 NA
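A minimal, self-contained restatement of the same idea (my sketch, with the dplyr import spelled out). The default = 0 passed to lag() and lead() is what keeps the outermost NA of a run that touches either end of the vector untouched; interior runs are already protected by their valid neighbours.
library(dplyr)   # for lag() and lead()
x <- c(3, 4, NA, NA, NA, 3, 3, rep(NA, 4L))
# an NA becomes 0 exactly when both of its neighbours are NA as well;
# default = 0 supplies a non-NA "neighbour" beyond the vector ends
x[is.na(x) & is.na(lag(x, 1, default = 0)) & is.na(lead(x, 1, default = 0))] <- 0
x
# [1]  3  4 NA  0 NA  3  3 NA  0  0 NA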
Previous Answer (also fast) -
Here's one way using rle and replace from base R. This method turns every NA that is not an endpoint of its run into a 0 -
na2zero <- function(x) {
run_lengths <- rle(is.na(x))$lengths
replace(x,
sequence(run_lengths) != 1 &
sequence(run_lengths) != rep(run_lengths, run_lengths) &
is.na(x),
0)
}
na2zero(x = c(3, 4, NA, NA, NA, 3, 3))
[1] 3 4 NA 0 NA 3 3
na2zero(x = c(3, 4, NA, NA, NA, NA, NA, 3, 3))
[1] 3 4 NA 0 0 0 NA 3 3
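A quick sketch (mine, not part of the answer) of the two index vectors the condition compares, for the question's example:
run_lengths <- rle(is.na(c(3, 4, NA, NA, NA, 3, 3)))$lengths
run_lengths                    # [1] 2 3 2
sequence(run_lengths)          # [1] 1 2 1 2 3 1 2  (index of each position within its run)
rep(run_lengths, run_lengths)  # [1] 2 2 3 3 3 2 2  (length of the run each position belongs to)
# an NA is replaced only when its within-run index is neither 1 nor the run length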
Updated Benchmarks -
set.seed(2)
x <- c(3, 4, NA, NA, NA, 3, 3)
x <- sample(x, 1e5, T)
library(microbenchmark)
# Rui, Shree_old/Shree_new, markus, IceCreamT, Uwe1/Uwe2/Uwe_Reduce and Grothendieck
# are wrappers around the approaches from the respective answers on this page
microbenchmark(
Rui(x),
Shree_old(x), Shree_new(x),
markus(x),
IceCreamT(x),
Uwe1(x), Uwe2(x), Uwe_Reduce(x),
Grothendieck(x),
times = 50
)
all.equal(Shree_new(x), Rui(x)) # [1] TRUE
all.equal(Shree_new(x), Shree_old(x)) # [1] TRUE
all.equal(Shree_new(x), markus(x)) # [1] TRUE
all.equal(Shree_new(x), Uwe1(x)) # [1] TRUE
all.equal(Shree_new(x), Uwe2(x)) # [1] TRUE
all.equal(Shree_new(x), Uwe_Reduce(x)) # [1] TRUE
all.equal(Shree_new(x), Grothendieck(x)) # [1] TRUE
Unit: milliseconds
expr min lq mean median uq max neval
Rui(x) 286.026540 307.586604 342.620266 318.404731 363.844258 518.03330 50
Shree_old(x) 51.556489 62.038875 85.348031 65.012384 81.882141 327.57514 50
Shree_new(x) 3.996918 4.258248 17.210709 6.298946 10.335142 207.14732 50
markus(x) 853.513854 885.419719 1001.450726 919.930389 1018.353847 1642.25435 50
IceCreamT(x) 12.162079 13.773873 22.555446 15.021700 21.271498 199.08993 50
Uwe1(x) 162.536980 183.566490 225.801038 196.882049 269.020395 439.17737 50
Uwe2(x) 83.582360 93.136277 115.608342 99.165997 115.376903 309.67290 50
Uwe_Reduce(x) 1.732195 1.871940 4.215195 2.016815 4.842883 25.91542 50
Grothendieck(x) 620.814291 688.107779 767.749387 746.699435 850.442643 982.49094 50
While this works for the provided example, I don't know that it gives the desired behavior for, say, x <- c(3, 4, NA, NA, NA, NA, 3, 3).
– Mako212
9 hours ago
@Mako212 It gives [1] 3 4 NA NA 0 NA 3 3, which seems correct. Am I missing something?
– Shree
9 hours ago
Not sure, the way I read OP's question I'd expect [1] 3 4 NA 0 0 NA 3 3 to be the desired result, but that's just how I interpreted it.
– Mako212
9 hours ago
Nah, the first 0 in your output is > 2 away from nearest valid point.
– Shree
9 hours ago
The question says "more than 2 intervals from a valid data point".
– Mako212
9 hours ago
Here's a data.table option
library(data.table)
na0_dt <- function(x){
replace(x, rowid(r <- rleid(xna <- is.na(x))) > 1 & rev(rowid(rev(r))) > 1 & xna, 0)
}
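A short sketch (my addition) of what the data.table helpers return for the example vector, which is all the bookkeeping the replace() condition needs:
library(data.table)
x <- c(3, 4, NA, NA, NA, 3, 3)
r <- rleid(is.na(x))
r                    # [1] 1 1 2 2 2 3 3  (one id per run of NA / non-NA values)
rowid(r)             # [1] 1 2 1 2 3 1 2  (position within the run, counted from the left)
rev(rowid(rev(r)))   # [1] 2 1 3 2 1 2 1  (position within the run, counted from the right)
# an NA with both counts > 1 is interior to its run and gets replaced by 0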
Benchmark and equality comparison.
x <- c(3, 4, NA, NA, NA, 3, 3)
x <- sample(x, 1e5, T)
all.equal(Rui(x), na0_dt(x))
# [1] TRUE
all.equal(Rui(x), markus(x))
# [1] TRUE
all.equal(Rui(x), Shree(x))
# [1] TRUE
all.equal(Rui(x), Uwe1(x))
# [1] TRUE
all.equal(Rui(x), Uwe2(x))
# [1] TRUE
library(microbenchmark)
microbenchmark(
Rui(x),
Shree(x),
markus(x),
na0_dt(x),
Uwe1(x),
Uwe2(x),
times = 50
)
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# Rui(x) 262.80052 286.75909 323.24804 300.73305 344.75091 488.45805 50 d
# Shree(x) 55.30956 64.63756 82.52067 72.70157 85.11101 367.10743 50 b
# markus(x) 807.05592 882.74498 1007.31244 958.71506 1075.13111 1590.34609 50 e
# na0_dt(x) 14.95426 16.75980 19.37040 18.53108 19.99180 38.43161 50 a
# Uwe1(x) 164.20027 178.59412 210.47365 198.76931 224.98389 327.70639 50 c
# Uwe2(x) 85.33296 97.42117 110.56380 102.55040 115.19759 192.73526 50 b
Again with length 1e3
x <- c(3, 4, NA, NA, NA, 3, 3)
x <- sample(x, 1e3, T)
microbenchmark(
Rui(x),
Shree(x),
markus(x),
na0_dt(x),
Uwe1(x),
Uwe2(x),
times = 50
)
# Unit: microseconds
# expr min lq mean median uq max neval cld
# Rui(x) 2543.180 2668.719 3084.0056 2847.386 3286.566 4820.105 50 bc
# Shree(x) 530.462 584.206 651.1348 600.821 653.949 1170.052 50 a
# markus(x) 7036.721 7604.517 9467.0319 8507.491 11014.979 24400.421 50 d
# na0_dt(x) 1008.001 1092.513 1179.9802 1160.001 1247.591 1564.308 50 a
# Uwe1(x) 2907.078 3190.976 3773.5566 3424.207 4162.873 9388.723 50 c
# Uwe2(x) 2299.489 2566.155 2803.8250 2703.386 2996.104 4038.155 50 b
Functions from other answers used in this benchmark
Rui <- function(x){
ave(x, cumsum(abs(c(0, diff(is.na(x))))), FUN = function(y){
if(anyNA(y)){
if(length(y) > 2) y[-c(1, length(y))] <- 0
}
y
})
}
Shree <- function(x) {
run_lengths <- rle(is.na(x))$lengths
replace(x,
sequence(run_lengths) != 1 &
sequence(run_lengths) != rep(run_lengths, run_lengths) &
is.na(x),
0)
}
markus <- function(x, gap = 1) {
stopifnot(gap >= 0)
if (gap == 0) {
x[is.na(x)] <- 0
x } else {
grp <- with(rle(is.na(x)), rep(seq_along(lengths), times = lengths))
tmp <- ave(x, grp, FUN = function(x) pmin(cumsum(is.na(x)), rev(cumsum(is.na(x)))))
replace(x, tmp > gap, 0)
}
}
na0_dt <- function(x){
replace(x, rowid(r <- rleid(xna <- is.na(x))) > 1 & rev(rowid(rev(r))) > 1 & xna, 0)
}
Uwe1 <- function(x){
data.table(x)[, x := replace(x, which(is.na(x))[-c(1L, .N)], 0), by =.(rleid(is.na(x)))]$x
}
Uwe2 <- function(x){
x[data.table(x)[, .I[is.na(x)][-c(1L, .N)], by =.(rleid(is.na(x)))]$V1] <- 0
x
}
Interestingly, my base R solution seems faster for x of length 1e3. I have no idea how data.table works, but it seems like it has some initial overhead after which it just takes off! Therefore, the longer the vector, the more it can compensate.
– Shree
7 hours ago
That's interesting, I added that case to the answer. I thought maybe the flipping-of-speeds would be because of the [.data.table overhead but it persists even after rewriting my function to avoid [.data.table.
– IceCreamToucan
7 hours ago
Your benchmark makes me feel I should downvote my answer. Tried it, but it's not possible, unfortunately.
– markus
7 hours ago
Ha, that wasn't my intention. Even for a vector of length 1e5 the times are listed in milliseconds, so the timing isn't all that important for most cases probably.
– IceCreamToucan
7 hours ago
For the sake of completeness, here are three other data.table approaches:
x <- c(3, 4, NA, NA, NA, 3, 3, NA, NA, 1, NA, 0, 0, rep(NA, 4L))
library(data.table)
data.table(x)[, x := replace(x, which(is.na(x))[-c(1L, .N)], 0), by =.(rleid(is.na(x)))]$x
[1] 3 4 NA 0 NA 3 3 NA NA 1 NA 0 0 NA 0 0 NA
x[data.table(x)[, .I[is.na(x)][-c(1L, .N)], by =.(rleid(is.na(x)))]$V1] <- 0
x
[1] 3 4 NA 0 NA 3 3 NA NA 1 NA 0 0 NA 0 0 NA
shift() & Reduce()
I was so focused on finding the right way to create groups that I started to think about the straightforward approach rather late. The rule is quite simple:
Replace by zero all NAs which are preceded and succeeded by another NA.
This can be accomplished by zoo::rollapply() as in G. Grothendieck's answer, or by using lag() & lead() as in Shree's latest edit.
However, my own benchmark (not posted here to avoid duplication with Shree's benchmark) shows that data.table::shift() and Reduce() is the fastest method so far.
isnax <- is.na(x)
x[Reduce(`&`, data.table::shift(isnax, -1:1))] <- 0
x
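A small sketch (mine) of what the shift()/Reduce() combination computes for the question's example: -1:1 yields the vector led by one position, unchanged, and lagged by one position, and Reduce(`&`, ...) is TRUE only where all three are TRUE, i.e. where an NA is flanked by NAs.
library(data.table)
x <- c(3, 4, NA, NA, NA, 3, 3)
isnax <- is.na(x)
Reduce(`&`, shift(isnax, -1:1))
# [1] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE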
It is also slightly faster than using lag() & lead() (please note that this differs from Shree's version, as is.na() is only called once):
isnax <- is.na(x)
x[isnax & dplyr::lag(isnax) & dplyr::lead(isnax)] <- 0
x
Indeed, it is faster. Nice! I'll add it to my benchmarks.
– Shree
4 hours ago
@Shree, thank you for benchmarking all the different solutions. BTW: I have switched from microbenchmark to bench for doing benchmarks because it makes it easy to vary problem sizes and to create charts. (I decided not to post the charts because you already took on the burden of doing all the benchmarks.)
– Uwe
4 hours ago
Based on the example, I assume what you mean is: if a value is NA and the adjacent values in both directions are NA (or in one direction if the value is first or last), then replace it with 0. Using a centered rolling window of length 3, return TRUE if the window is all NA, and then replace the TRUE positions with 0. This gives the following one-liner:
library(zoo)
replace(x, rollapply(c(TRUE, is.na(x), TRUE), 3, all), 0)
## [1] 3 4 NA 0 NA 3 3
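A sketch (my illustration) of the intermediate logical vector: the TRUE padding at both ends makes the non-existent neighbour beyond the ends count as missing, which is what handles the "first or last" case mentioned above.
library(zoo)
x <- c(3, 4, NA, NA, NA, 3, 3)
rollapply(c(TRUE, is.na(x), TRUE), 3, all)
# [1] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE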
zoo has some pretty handy functions to work with missing values.
– PavoDive
7 hours ago
Another base approach
x <- c(3, 4, NA, NA, NA, 3, 3, NA, 3, NA, NA, NA, NA, 1)
Create a grouping variable
grp <- with(rle(is.na(x)), rep(seq_along(lengths), lengths)) # same as rleid(is.na(x))
For each group, calculate the parallel minimum of cumsum(is.na(x)) and its reverse; this will be greater than one for values that are "more than 2 intervals from a valid data point" away.
tmp <- ave(x, grp, FUN = function(x) pmin(cumsum(is.na(x)), rev(cumsum(is.na(x)))))
Finally, use this as an indicator to replace the desired values in x
replace(x, tmp > 1, 0)
# [1] 3 4 NA 0 NA 3 3 NA 3 NA 0 0 NA 1
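For reference, a self-contained sketch (my addition, repeating the steps above) of the intermediate tmp, which makes the tmp > 1 test easy to follow:
x   <- c(3, 4, NA, NA, NA, 3, 3, NA, 3, NA, NA, NA, NA, 1)
grp <- with(rle(is.na(x)), rep(seq_along(lengths), lengths))
tmp <- ave(x, grp, FUN = function(x) pmin(cumsum(is.na(x)), rev(cumsum(is.na(x)))))
tmp
# [1] 0 0 1 2 1 0 0 1 0 1 2 2 1 0
# within an NA run, tmp is the 1-based distance to the nearer end of that run
# (and 0 inside non-NA runs); tmp > 1 therefore marks the interior NAs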
Written as a function
f <- function(x, gap = 1) {
stopifnot(gap >= 0)
if (gap == 0) {
x[is.na(x)] <- 0
x } else {
grp <- with(rle(is.na(x)), rep(seq_along(lengths), times = lengths))
tmp <- ave(x, grp, FUN = function(x) pmin(cumsum(is.na(x)), rev(cumsum(is.na(x)))))
replace(x, tmp > gap, 0)
}
}
f(x)