C# - Design guidelines for heavy string operations
In my current project, I am working on a word processing and highlighting solution. Typically, each call to the class processes strings that are approximately 4000-5000 words long.
The structure looks like this:

    public string ParseText(string inn)
    {
        // heavy parsing
        return inn; // placeholder result
    }

    public string HighlightText(string inn)
    {
        var cleaned = CleanupText(inn);
        var parsed = ParseText(cleaned);
        // highlight algorithm
        var formatted = FormatText(parsed);
        return formatted;
    }

    public string CleanupText(string inn)
    {
        // heavy parsing
        return inn; // placeholder result
    }

    public string FormatText(string inn)
    {
        // heavy parsing
        return inn; // placeholder result
    }

So I feel that passing the same string around between different methods decreases performance. Even though string is a reference type, it creates a new string and allocates new memory each time it is passed as a variable, due to its immutable nature. This is different from other user-defined reference types (classes).
To explain more precisely:
If I pass a class type across methods and update its Name property, the change is reflected automatically in the caller.

    private void Sample(Npx inn)
    {
        inn.Name = "changed name";
    }

    var npx = new Npx() { Name = "a", Priority = 3 };
    Sample(npx);

But with strings, it is not:

    private void Sample(string inn)
    {
        inn = "changed name";
    }

    var npx = "aaaa";
    Sample(npx);

So I am considering using ref in heavy-duty string processing methods. Is it acceptable to use the ref keyword on a reference type? I ask because code review tools complain when I feed them this kind of code.
What are your thoughts?
"Even though string is a reference type, it creates a new string and allocates new memory each time it is passed as a variable, due to its immutable nature."
This is wrong, and it reflects a misunderstanding of what immutability means and of how reference types are passed.
"So I am considering using ref in heavy-duty string processing methods."
You start from a false premise, and so you reach an untenable conclusion.
I'll elaborate on the first point a bit.
Immutability means that you can't mutate an instance of string. So, if someone hands you an instance of string, you can't change it*.
* There are ways, but let's ignore them here.
This does not mean that if you pass an instance of string as a parameter to a method, the instance is copied.
All parameters in C# are passed by value (unless they are marked ref or out). Passed by value means that a copy of the value is made, and that copy is passed to the method. For reference types, that value is a reference. In particular, when instances of string are passed as parameters to methods, a copy of the reference to the string is passed to the callee.
So, when you pass an instance of string, only a copy of the reference is made. If you change the value of the reference in the body of the method, the caller doesn't see that change. If you could somehow modify the referent (you can't, because strings are immutable), the caller would see that change.
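The point about copying only the reference can be checked with a minimal sketch. The names `PassingDemo` and `Echo` are hypothetical and not part of the question's code; the sketch assumes that `object.ReferenceEquals` is an acceptable way to test whether two variables refer to the same instance.

```csharp
using System;

static class PassingDemo
{
    // Hypothetical helper: receives a copy of the reference and hands it back unchanged.
    public static string Echo(string inn)
    {
        return inn;
    }

    static void Main()
    {
        // Built at run time so that interning of literals plays no part in the result.
        string text = new string('x', 5000);

        string returned = Echo(text);

        // If passing or returning a string copied its characters,
        // these would be two different instances.
        Console.WriteLine(object.ReferenceEquals(text, returned)); // True
    }
}
```

Only the reference crosses the method boundary here, so the cost of the call does not depend on the length of the string.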
Now, let's consider a simple mutable reference type.

    class Foo
    {
        public int Bar { get; private set; }

        // Mutates this instance and returns it.
        public Foo SetBar(int bar)
        {
            this.Bar = bar;
            return this;
        }
    }

And a method

    public void Frob(Foo foo, int bar)
    {
        foo.SetBar(bar);
    }

and a simple setup like

    var foo = new Foo().SetBar(42); // Bar has a private setter, so it is set through SetBar
    Console.WriteLine(foo.Bar);
    Frob(foo, 17);
    Console.WriteLine(foo.Bar);

This will print

    42
    17

to the console. Why? Because Frob mutates the referent.
Now let's make a simple immutable object.

    class ImmutableFoo
    {
        private readonly int bar;
        private readonly string name;

        public int Bar { get { return this.bar; } }
        public string Name { get { return this.name; } }

        public ImmutableFoo(int bar, string name)
        {
            this.bar = bar;
            this.name = name;
        }

        // Returns a new instance instead of mutating this one.
        public ImmutableFoo SetBar(int bar)
        {
            return new ImmutableFoo(bar, this.name);
        }
    }

And a method

    public void Frob(ImmutableFoo immutableFoo, int bar)
    {
        immutableFoo = immutableFoo.SetBar(bar);
    }

and a simple setup like

    var foo = new ImmutableFoo(42, "fubar");
    Console.WriteLine("{0}, {1}", foo.Bar, foo.Name);
    Frob(foo, 17);
    Console.WriteLine("{0}, {1}", foo.Bar, foo.Name);

This will print

    42, fubar
    42, fubar

to the console. Why? Because Frob assigns a new reference to the variable immutableFoo in the body of Frob, and the caller does not see this, because the value of the reference foo does not change; it is not modified by the method Frob.
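The same rule holds for mutable types, which suggests that the behavior comes from how parameters are passed rather than from immutability itself. The following is a minimal sketch with hypothetical names (`Counter`, `ReassignDemo`, `Mutate`, `Reassign`) that are not taken from the question or the examples above.

```csharp
using System;

// Hypothetical mutable type used only for this illustration.
class Counter
{
    public int Value { get; set; }
}

static class ReassignDemo
{
    // Changes the object that both the caller and the callee refer to.
    public static void Mutate(Counter c)
    {
        c.Value = 17;
    }

    // Changes only the callee's local copy of the reference.
    public static void Reassign(Counter c)
    {
        c = new Counter { Value = 99 };
    }

    static void Main()
    {
        var counter = new Counter { Value = 42 };

        Reassign(counter);
        Console.WriteLine(counter.Value); // 42

        Mutate(counter);
        Console.WriteLine(counter.Value); // 17
    }
}
```

Assigning to a parameter never reaches the caller unless the parameter is marked ref or out; only changes made to the shared object do.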
It works the same way for string. When you have a method like

    private void Sample(string inn)
    {
        inn = "changed name";
    }

what is happening here is that the string "changed name" is interned on the heap, and a reference to that string is assigned to the variable inn in the method Sample. In

    var npx = "aaaa";
    Sample(npx);

the caller will not see this, because npx still has the same reference value as before the call. However, when you say

    private void Sample(ref string inn)
    {
        inn = "changed name";
    }

and

    var npx = "aaaa";
    Sample(ref npx);

it is now the case that inn in Sample is an alias for npx in the caller. In this case, the caller's reference is modified, because the two variables are aliases for the same storage location.
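The two string cases can be put side by side in one small program. This is only a sketch; the names `RefDemo`, `AssignByValue`, and `AssignByRef` are hypothetical stand-ins for the two versions of the sample method.

```csharp
using System;

static class RefDemo
{
    // inn is a copy of the caller's reference, so the assignment stays local.
    public static void AssignByValue(string inn)
    {
        inn = "changed name";
    }

    // inn is an alias for the caller's variable, so the assignment is visible there.
    public static void AssignByRef(ref string inn)
    {
        inn = "changed name";
    }

    static void Main()
    {
        string npx = "aaaa";

        AssignByValue(npx);
        Console.WriteLine(npx); // aaaa

        AssignByRef(ref npx);
        Console.WriteLine(npx); // changed name
    }
}
```

In both versions only a reference-sized value is passed, so ref changes which variable the assignment reaches, not how much data is copied.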