I'm trying to integrate similar into https://github.com/helix-editor/helix/pull/228, but I've been having a lot of difficulty working out at which index in the &str each Change should be applied. I'm trying to use Change::old_index() and Change::new_index(), since they sound like what I need, but the code keeps panicking because it unwraps a None value.
The text is a ropey::Rope. I'm iterating over it with ropey::iter::Chunks and then decoding it to a user-selected encoding through encoding_rs, though it is UTF-8 the entire way through.
/// (from, to, replacement)
pub type Change = (usize, usize, Option<Tendril>); // https://docs.rs/tendril/0.4.2/tendril/struct.Tendril.html

let iter = self.text.chunks(); // ropey::Rope::chunks()
let iter_len = iter.clone().count();
let mut decoder = encoding.new_decoder(); // encoding_rs::Encoding::new_decoder()
let mut changes: Vec<Change> = Vec::new();
for (i, chunk) in iter.enumerate() {
    // Check if this is the last element in the iterator.
    let is_last = i == iter_len - 1;
    let capacity = Self::calculate_decode_capacity(&mut decoder, chunk.as_bytes());
    let mut buf = String::with_capacity(capacity);
    let mut total_read = 0;
    // Loop until the entire chunk has been decoded into `buf`.
    loop {
        let (result, read, ..) =
            decoder.decode_to_string(chunk[total_read..].as_bytes(), &mut buf, is_last);
        // Track how many bytes we have read so far, in case we need to allocate more
        // capacity to `buf`.
        total_read += read;
        // Check if we need to allocate more capacity to `buf`, otherwise append
        // to `changes`.
        match result {
            encoding_rs::CoderResult::InputEmpty => {
                debug_assert_eq!(total_read, chunk.len());
                let diff = similar::TextDiff::from_unicode_words(chunk, &buf);
                let diff_ops = diff.ops();
                let diff_changes = diff_ops
                    .iter()
                    .flat_map(|x| diff.iter_changes(x))
                    .filter_map(|x| {
                        let index = x.old_index().unwrap_or(x.new_index().unwrap());
                        let value = x.value();
                        match x.tag() {
                            similar::ChangeTag::Delete => {
                                Some((index, index + value.chars().count(), None))
                            }
                            similar::ChangeTag::Insert => {
                                Some((index, index + value.chars().count(), Some(value.into())))
                            }
                            similar::ChangeTag::Equal => None,
                        }
                    });
                changes.extend(diff_changes);
                break;
            }
            encoding_rs::CoderResult::OutputFull => {
                debug_assert!(buf.len() > total_read);
                let needed_capacity =
                    Self::calculate_decode_capacity(&mut decoder, chunk[total_read..].as_bytes());
                buf.reserve(needed_capacity);
            }
        }
    }
    if is_last {
        break;
    }
}
The code has other logic errors, for example the index is relative to the current chunk rather than to the overall ropey::Rope, but I don't think that matters for the unwrapping problem.
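
One thing that may be relevant to the panic: Option::unwrap_or evaluates its argument eagerly, so in x.old_index().unwrap_or(x.new_index().unwrap()) the inner unwrap() runs even when old_index() is Some. The sketch below reproduces that with std only. It assumes that a Delete change has no new index and an Insert change has no old index; FakeChange, pick_index and pick_index_eager are hypothetical stand-ins and not part of similar.

```rust
/// Hypothetical stand-in for the two index accessors on similar::Change.
struct FakeChange {
    old_index: Option<usize>,
    new_index: Option<usize>,
}

/// Fallback without unwrapping: new_index is only used when old_index is None.
fn pick_index(c: &FakeChange) -> Option<usize> {
    c.old_index.or(c.new_index)
}

/// Same shape as the expression in the snippet above: the argument to
/// unwrap_or is evaluated before unwrap_or itself is called.
fn pick_index_eager(c: &FakeChange) -> usize {
    c.old_index.unwrap_or(c.new_index.unwrap())
}

fn main() {
    // Assumed shape of a Delete change: old index present, new index absent.
    let delete = FakeChange { old_index: Some(3), new_index: None };
    // Assumed shape of an Insert change: new index present, old index absent.
    let insert = FakeChange { old_index: None, new_index: Some(5) };

    assert_eq!(pick_index(&delete), Some(3));
    assert_eq!(pick_index(&insert), Some(5));

    // The eager form panics on the Delete-shaped value even though
    // old_index is Some, because new_index.unwrap() runs first.
    let panicked = std::panic::catch_unwind(|| pick_index_eager(&delete)).is_err();
    assert!(panicked);
}
```

If that assumption holds, matching on x.tag() first and reading only the index that exists for that tag would avoid the unwrap entirely. Note that this only addresses the panic, not whether the indices are the right unit for slicing the &str.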