C# - Design guidelines for heavy string operations


In my current project, I am working on a word processing and highlighting solution. Typically, every call to my class processes strings that are approximately 4000-5000 words long.

The structure looks like this:

    public string ParseText(string inn)
    {
        // heavy parsing
    }

    public string HighlightText(string inn)
    {
        var cleaned = CleanUpText(inn);
        var parsed = ParseText(cleaned);
        // highlight algorithm
        var formatted = FormatText(parsed);
        return formatted;
    }

    public string CleanUpText(string inn)
    {
        // heavy parsing
    }

    public string FormatText(string inn)
    {
        // heavy parsing
    }

So I feel that passing the same string around between different methods decreases performance. Even though string is a reference type, it creates a new string and allocates new memory for each passed variable, due to its immutable nature. This is different from other user-defined reference types (classes).

To explain more precisely:

If I pass such a type across methods and update its Name property, the change is reflected automatically in the caller.

    private void Sample(Npx inn)
    {
        inn.Name = "changed name";
    }

    var npx = new Npx() { Name = "a", Priority = 3 };
    Sample(npx);

But with strings it is not:

    private void Sample(string inn)
    {
        inn = "changed name";
    }

    var npx = "aaaa";
    Sample(npx);

So I am considering using ref in heavy-duty string processing methods. Is it acceptable to use the ref keyword on a reference type? I ask because code review tools complain when I feed them this kind of code.

What are your thoughts?

"Even though string is a reference type, it creates a new string and allocates new memory for each passed variable, due to its immutable nature."

This is wrong. It is a misunderstanding of what immutability means, and of how reference types are passed.

"So I am considering using ref in heavy-duty string processing methods."

You start from a false premise, and reach an untenable conclusion.

I'll elaborate on the first point a bit.

Immutability means that you can't mutate an instance of string. So, if someone hands you an instance of string, you can't change it*.

* There are ways, but let's ignore them here.

This does not mean that if you pass an instance of string as a parameter to a method, the instance is copied.

All parameters in C# are passed by value (unless they are marked ref or out). Passed by value means that a copy of the value is made, and that copy is passed to the method. For reference types, the value is the reference. In particular, for instances of string as parameters to methods, a copy of the reference to the string is passed to the callee; the characters themselves are not copied.

So, for an instance of string, only a copy of the reference is made. If you change the value of the reference in the body of the method (that is, assign a different reference to the parameter), the caller doesn't see the change. If you could somehow modify the referent (you can't, because strings are immutable), the caller would see the change.
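One way to check this for yourself is sketched below. It is only an illustration: the class name PassByValueDemo and the helpers ReceivesSameObject and Reassign are made up for this example and are not part of the question's code. The first helper uses object.ReferenceEquals to show that the callee sees the very same string object the caller holds, and the second shows that reassigning the parameter leaves the caller's variable alone.

```csharp
using System;

public static class PassByValueDemo
{
    // Hypothetical helper: true when the parameter refers to the same
    // object as the caller's reference, i.e. no string copy was made.
    public static bool ReceivesSameObject(string text, object callerReference)
    {
        return object.ReferenceEquals(text, callerReference);
    }

    // Hypothetical helper: reassigning the parameter only changes the
    // callee's private copy of the reference.
    public static void Reassign(string text)
    {
        text = "something else";
    }

    public static void Main()
    {
        // Built at run time so the string is not a compile-time literal.
        string big = new string('x', 50000);

        Console.WriteLine(ReceivesSameObject(big, big)); // True

        Reassign(big);
        Console.WriteLine(big.Length); // 50000
    }
}
```

Passing the 50000-character string costs the same as passing any other reference; the character data is never duplicated by the call itself.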

Now, let's consider a simple mutable reference type.

    class Foo
    {
        public int Bar { get; private set; }
        public Foo SetBar(int bar) { this.Bar = bar; return this; }
    }

And a method

    public void Frob(Foo foo, int bar)
    {
        foo.SetBar(bar);
    }

And a simple setup like

    // Bar has a private setter, so it is initialized through SetBar.
    var foo = new Foo().SetBar(42);
    Console.WriteLine(foo.Bar);
    Frob(foo, 17);
    Console.WriteLine(foo.Bar);

This will print

    42
    17

to the console. Why? Because Frob mutates the referent.

Now let's make a simple immutable object.

    class ImmutableFoo
    {
        private readonly int bar;
        private readonly string name;
        public int Bar { get { return this.bar; } }
        public string Name { get { return this.name; } }
        public ImmutableFoo(int bar, string name) { this.bar = bar; this.name = name; }
        public ImmutableFoo SetBar(int bar) { return new ImmutableFoo(bar, this.name); }
    }

And a method

    public void Frob(ImmutableFoo immutableFoo, int bar)
    {
        immutableFoo = immutableFoo.SetBar(bar);
    }

And a simple setup like

    var foo = new ImmutableFoo(42, "fubar");
    Console.WriteLine("{0}, {1}", foo.Bar, foo.Name);
    Frob(foo, 17);
    Console.WriteLine("{0}, {1}", foo.Bar, foo.Name);

This will print

    42, fubar
    42, fubar

to the console. Why? Because Frob assigns a new reference to the variable immutableFoo in the body of Frob, but the caller does not see this, because the value of the reference foo does not change; it is not modified by the method Frob.

It works the same way with string. When you have a method like

    private void Sample(string inn)
    {
        inn = "changed name";
    }

what is happening here is that the literal "changed name" refers to a string that has been interned on the heap, and you are assigning a reference to that string to the variable inn in the method Sample. Then in

    var npx = "aaaa";
    Sample(npx);

the caller will not see this, because npx still has the same reference value as before the call. However, when you say

    private void Sample(ref string inn)
    {
        inn = "changed name";
    }

and

    var npx = "aaaa";
    Sample(ref npx);

now it is the case that inn in Sample is an alias for npx in the caller. In this case, the caller's reference is modified, because the two variables are aliases for the same storage location.
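The two cases can be put side by side in one small program. This is a minimal sketch for illustration; the class name RefAliasDemo and the method names AssignByValue and AssignByRef are invented here to keep the two variants of Sample apart.

```csharp
using System;

public static class RefAliasDemo
{
    // By value: inn is a copy of the caller's reference.
    public static void AssignByValue(string inn)
    {
        inn = "changed name";
    }

    // By ref: inn is an alias for the caller's variable.
    public static void AssignByRef(ref string inn)
    {
        inn = "changed name";
    }

    public static void Main()
    {
        string npx = "aaaa";

        AssignByValue(npx);
        Console.WriteLine(npx); // aaaa

        AssignByRef(ref npx);
        Console.WriteLine(npx); // changed name
    }
}
```

Note that ref changes which variable an assignment lands in; it does not change how many strings are allocated. In both variants only a reference crosses the call boundary, so ref offers no saving for the parsing methods in the question.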

